

THE ANNALS 
of 
MATHEMATICAL 


STATISTICS 


(FOUNDED BY H. C. CARVER) 


THE OFFICIAL JOURNAL OF THE INSTITUTE
OF MATHEMATICAL STATISTICS 


Contents 


Statistical Decision Functions. ABRAHAM WALD
The Multiplicative Process. RICHARD OTTER
Application of the Radon-Nikodym Theorem to the Theory of Sufficient
    Statistics. PAUL R. HALMOS AND L. J. SAVAGE
On Designing Single Sampling Inspection Plans. FRANK E. GRUBBS
On the Range-midrange Test and Some Tests with Bounded Significance
    Levels. JOHN E. WALSH
Asymptotic Studentization in Testing of Hypotheses. HERMAN CHERNOFF
Some Low Moments of Order Statistics. H. J. GODWIN
On a Theorem of Hsu and Robbins. P. ERDÖS
Notes:
    Brownian Motion on the Surface of the 3-Sphere. KÔSAKU
    On the Strong Stability of a Sequence of Events. ARYEH DVORETZKY
    A Note on Weighing Design. K. S. BANERJEE
    Control Chart for Largest and Smallest Values. JOHN M.
    Sufficiency, Truncation and Selection. JOHN W. TUKEY
    On a Probability Distribution. MAX A. WOODBURY
    A Graphical Determination of Sample Size for Wilks' Tolerance
        Limits. Z. W. BIRNBAUM AND H. S. ZUCKERMAN
Abstracts of Papers 
News and Notices 
Report of the New York Meeting of the Institute 
Constitution and By-Laws of the Institute 


Vol. XX, No. 2 — June, 1949 





THE ANNALS 
OF MATHEMATICAL STATISTICS 


EDITED BY 


S. S. WILKS, Editor


M. S. BARTLETT        HARALD CRAMER        J. NEYMAN
WILLIAM G. COCHRAN    W. EDWARDS DEMING    WALTER A. SHEWHART
ALLEN T. CRAIG        J. L. DOOB           JOHN W. TUKEY
C. C. CRAIG           W. FELLER            A. WALD
                      HAROLD HOTELLING
WITH THE COOPERATION OF 


T. W. ANDERSON, JR.   CHURCHILL EISENHART  H. B. MANN
DAVID BLACKWELL       M. A. GIRSHICK       ALEXANDER M. MOOD
J. H. CURTISS         PAUL R. HALMOS       FREDERICK MOSTELLER
J. F. DALY            PAUL G. HOEL         H. E. ROBBINS
HAROLD F. DODGE       MARK KAC             HENRY SCHEFFÉ
PAUL S. DWYER         E. L. LEHMANN        JACOB WOLFOWITZ
                      WILLIAM G. MADOW


The ANNALS OF MATHEMATICAL STATISTICS is published quarterly by the
Institute of Mathematical Statistics, Mt. Royal & Guilford Aves., Baltimore 2,
Md. Subscriptions, renewals, orders for back numbers and other business
communications should be sent to the ANNALS OF MATHEMATICAL STATISTICS, Mt.
Royal & Guilford Aves., Baltimore 2, Md., or to the Secretary of the Institute
of Mathematical Statistics, P. S. Dwyer, 116 Rackham Hall, University of
Michigan, Ann Arbor, Mich.


Changes in mailing address which are to become effective for a given issue 
should be reported to the Secretary on or before the 15th of the month preceding 
the month of that issue. The months of issue are March, June, September and 
December. 


Manuscripts for publication in the ANNALS OF MATHEMATICAL STATISTICS 
should be sent to S. S. Wilks, Fine Hall, Princeton, New Jersey. Manuscripts
should be typewritten double-spaced with wide margins, and the original copy 
should be submitted. Footnotes should be reduced to a minimum and whenever 
possible replaced by a bibliography at the end of the paper; formulae in foot- 
notes should be avoided. Figures, charts, and diagrams should be drawn on 
plain white paper or tracing cloth in black India ink twice the size they are to 
be printed. Authors are requested to keep in mind typographical difficulties 
of complicated mathematical formulae. 


Authors will ordinarily receive only galley proofs. Fifty reprints without 
covers will be furnished free. Additional reprints and covers furnished at cost. 


The subscription price for the ANNALS is $8.00 inside the Western Hemi- 
sphere and $5.00 elsewhere. Single copies $3.00. Back numbers are available 
at $8.00 per volume or $3.00 per single issue. 


COMPOSED AND PRINTED AT THE
WAVERLY PRESS, INC.
BALTIMORE, MD., U. S. A.


Entered as second-class matter at the Post Office at Baltimore, Maryland, under the act of March 3, 1879. 













STATISTICAL DECISION FUNCTIONS 


By ABRAHAM WALD¹


Columbia University 


Introduction and summary. The foundations of a general theory of statistical
decision functions, including the classical non-sequential case as well as the
sequential case, were discussed by the author in a previous publication
[3]. Several assumptions made in [3] appear, however, to be unnecessarily
restrictive (see conditions 1-7, pp. 297-8 in [3]). These assumptions, moreover,
are not always fulfilled for statistical problems in their conventional form. In
this paper the main results of [3], as well as several new results, are obtained
from a considerably weaker set of conditions which are fulfilled for most of the
statistical problems treated in the literature. It seemed necessary to abandon
most of the methods of proof used in [3] (particularly those in section 4 of [3])
and to develop the theory from the beginning. To make the present paper
self-contained, the basic definitions already given in [3] are briefly restated in
section 2.1.

In [3] it is postulated (see Condition 3, p. 297) that the space $\Omega$ of all admissible
distribution functions F is compact. In problems where the distribution
function F is known except for the values of a finite number of parameters, i.e., where
$\Omega$ is a parametric class of distribution functions, the compactness condition will
usually not be fulfilled if no restrictions are imposed on the possible values of the
parameters. For example, if $\Omega$ is the class of all univariate normal distributions
with unit variance, $\Omega$ is not compact. It is true that by restricting the parameter
space to a bounded and closed subset of the unrestricted space, compactness of
$\Omega$ will usually be attained. Since such a restriction of the parameter space can
frequently be made in applied problems, the condition of compactness may not
be too restrictive from the point of view of practical applications. Nevertheless,
it seems highly desirable from the theoretical point of view to eliminate or to
weaken the condition of compactness of $\Omega$. This is done in the present paper.
The compactness condition is completely omitted in the discrete case (Theorems
2.1-2.5), and replaced by the condition of separability of $\Omega$ in the continuous
case (Theorems 3.1-3.4). The latter condition is fulfilled in most of the
conventional statistical problems.

Another restriction postulated in [3] (Condition 4, p. 297) is the continuity 
of the weight function W(F, d) in F. As explained in section 2.1 of the present
paper, the value of W(F, d) is interpreted as the loss suffered when F happens to 
be the true distribution of the chance variables under consideration and the 
decision d is made by the statistician. While the assumption of continuity of 
W(F, d) in F may seem reasonable from the point of view of practical applica- 
tion, it is rather undesirable from the theoretical point of view for the following 


1 Work done under the sponsorship of the Office of Naval Research. 










reasons. It is of considerable theoretical interest to consider simplified weight 
functions W(F, d) which can take only the values 0 and 1 (the value 0 corresponds 
to a correct decision, and the value 1 to a wrong decision). Frequently, such 
weight functions are necessarily discontinuous. Consider, for example, the 
problem of testing the hypothesis H that the mean $\theta$ of a normally distributed
chance variable X with unit variance is equal to zero. Let $d_1$ denote the decision
to accept H, and $d_2$ the decision to reject H. Assigning the value zero to the
weight W whenever a correct decision is made, and the value 1 whenever a
wrong decision is made, we have:


$W(\theta, d_1) = 0$ for $\theta = 0$, and $= 1$ for $\theta \ne 0$; $W(\theta, d_2) = 0$ for $\theta \ne 0$,
and $= 1$ for $\theta = 0$.


This weight function is obviously discontinuous. In the present paper the 
main results (Theorems 2.1-2.5 and Theorems 3.1-3.4) are obtained without 
making any continuity assumption regarding W(F, d). 
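
As an illustration only, the zero-one weight function of this example can be written out in a few lines of Python. The function name `weight` and the encoding of the decisions to accept and to reject H as the integers 1 and 2 are assumptions of this sketch, not notation from the paper. Evaluating the weight of the decision to accept H along a sequence of means tending to zero exhibits the jump at zero.

```python
def weight(theta, decision):
    """Zero-one weight for testing H: theta = 0.

    decision 1 = accept H, decision 2 = reject H (encoding assumed here).
    Returns 0 for a correct decision and 1 for a wrong one.
    """
    h_is_true = (theta == 0)
    if decision == 1:
        return 0 if h_is_true else 1
    if decision == 2:
        return 1 if h_is_true else 0
    raise ValueError("decision must be 1 or 2")


# The weight of accepting H stays at 1 as theta approaches 0 and drops to 0 at 0.
values_near_zero = [weight(t, 1) for t in (0.1, 0.01, 0.001, 0.0)]
```

No continuous function of the mean can reproduce this behaviour, which is why a continuity assumption on the weight function would exclude the example.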

The restrictions imposed in the present paper on the cost function of
experimentation are considerably weaker than those formulated in [3]. Condition 5
[3, p. 297] concerning the class $\Omega$ of admissible distribution functions, and
condition 7 [3, p. 298] concerning the class of decision functions at the disposal of
the statistician are omitted here altogether.

One of the new results obtained here is the establishment of the existence 
of so called minimax solutions under rather weak conditions (Theorems 2.3 and 
3.2). This result is a simple consequence of two lemmas (Lemmas 2.4 and 3.3) 
which seem to be of interest in themselves. 

The present paper consists of three sections. In the first section several 
theorems are given concerning zero sum two person games which go somewhat 
beyond previously published results. The results in section 1 are then applied 
to statistical decision functions in sections 2 and 3. Section 2 treats the case of 
discrete chance variables, while section 3 deals with the continuous case. The 
two cases have been treated separately, since the author was not able to find 
any simple and convenient way of combining them into a single more general 
theory. 


1. Conditions for strict determinateness of a zero sum two person game. 
The normalized form of a zero sum two person game may be defined as follows 
(see [1, section 14.1]): there are two players and there is a bounded and real 
valued function K(a, b) of two variables a and b given where a may be any point 
of a space A and b may be any point of a space B. Player 1 chooses a point 
a in A and player 2 chooses a point b in B, each choice being made in complete
ignorance of the other. Player 1 then gets the amount K(a, b) and player 2 the 
amount —K(a, b). Clearly, player 1 wishes to maximize K(a, b) and player 2 
wishes to minimize K(a, b). 

Any element a of A will be called a pure strategy of player 1, and any element 




b of B a pure strategy of player 2. A mixed strategy of player 1 is defined as
follows: instead of choosing a particular element a of A, player 1 chooses a
probability measure $\xi$ defined over an additive class $\mathfrak{A}$ of subsets of A and the
point a is then selected by a chance mechanism constructed so that for any
element $\alpha$ of $\mathfrak{A}$ the probability that the selected element a will be contained in
$\alpha$ is equal to $\xi(\alpha)$. Similarly, a mixed strategy of player 2 is given by a
probability measure $\eta$ defined over an additive class $\mathfrak{B}$ of subsets of B and the element b
is selected by a chance mechanism so that for any element $\beta$ of $\mathfrak{B}$ the probability
that the selected element b will be contained in $\beta$ is equal to $\eta(\beta)$. The expected
value of the outcome K(a, b) is then given by


(1.1)    $K^*(\xi, \eta) = \int_B \int_A K(a, b)\, d\xi\, d\eta.$


We can now reinterpret the value of K(a, b) as the value of $K^*(\xi_a, \eta_b)$ where $\xi_a$
and $\eta_b$ are probability measures which assign probability 1 to a and b,
respectively. In what follows, we shall write $K(\xi, \eta)$ for $K^*(\xi, \eta)$; K(a, b) will be used
synonymously with $K(\xi_a, \eta_b)$, $K(a, \eta)$ synonymously with $K(\xi_a, \eta)$ and $K(\xi, b)$
synonymously with $K(\xi, \eta_b)$. This can be done without any danger of confusion.
A game is said to be strictly determined if


(1.2)    $\sup_{\xi} \inf_{\eta} K(\xi, \eta) = \inf_{\eta} \sup_{\xi} K(\xi, \eta).$


The basic theorem proved by von Neumann [1] states that if A and B are 
finite the game is always strictly determined, i.e., (1.2) holds. In some previous 
publications (see [2] and [3]) the author has shown that (1.2) always holds if one 
of the spaces A and B is finite or compact in the sense of some intrinsic metric, 
but does not necessarily hold otherwise. A necessary and sufficient condition 
for the validity of (1.2) was given in [2] for spaces A and B with countably many 
elements. In this section we shall give sufficient conditions as well as necessary 
and sufficient conditions for the validity of (1.2) for arbitrary spaces A and B. 
These results will then be used in later sections. 
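
A small numerical sketch may help fix the notation for finite A and B. The payoff matrix below (matching pennies) and all function names are assumptions chosen for illustration. With finite spaces the double integral in (1.1) reduces to a double sum, and the two sides of (1.2) restricted to pure strategies can differ even though, by von Neumann's theorem, they agree once mixed strategies are admitted.

```python
from fractions import Fraction


def expected_payoff(K, xi, eta):
    """Expected outcome for finite A, B: sum over a, b of K[a][b] * xi[a] * eta[b]."""
    return sum(K[a][b] * xi[a] * eta[b]
               for a in range(len(K)) for b in range(len(K[0])))


def pure_sup_inf(K):
    """Sup over pure a of Inf over pure b of K(a, b)."""
    return max(min(row) for row in K)


def pure_inf_sup(K):
    """Inf over pure b of Sup over pure a of K(a, b)."""
    return min(max(K[a][b] for a in range(len(K))) for b in range(len(K[0])))


# Hypothetical payoff matrix: matching pennies.
K = [[1, -1],
     [-1, 1]]
half = [Fraction(1, 2), Fraction(1, 2)]
pure = ([1, 0], [0, 1])

lower_pure = pure_sup_inf(K)   # -1
upper_pure = pure_inf_sup(K)   # 1

# K(xi, eta) is linear in eta, so its infimum over eta is attained at a pure b;
# likewise the supremum over xi is attained at a pure a.
guaranteed_to_1 = min(expected_payoff(K, half, e) for e in pure)
guaranteed_to_2 = max(expected_payoff(K, x, half) for x in pure)
```

Here the uniform mixed strategy guarantees player 1 at least 0 and holds player 2's loss to at most 0, so both sides of (1.2) equal 0 for this matrix, while the pure-strategy values are -1 and 1.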

In what follows, for any subset $\alpha$ of A the symbol $\xi_\alpha$ will denote a probability
measure $\xi$ in A for which $\xi(\alpha) = 1$. Similarly, for any subset $\beta$ of B, $\eta_\beta$ will stand
for a probability measure $\eta$ in B for which $\eta(\beta) = 1$. We shall now prove the
following lemma.

LEMMA 1.1. Let $\{\alpha_i\}$ (i = 1, 2, ..., ad inf.) be a sequence of subsets of A
such that $\alpha_i \subset \alpha_{i+1}$ and let $\alpha = \sum_{i=1}^{\infty} \alpha_i$. Then

(1.3)    $\lim_{i \to \infty} \sup_{\xi_{\alpha_i}} \inf_{\eta} K(\xi_{\alpha_i}, \eta) = \sup_{\xi_\alpha} \inf_{\eta} K(\xi_\alpha, \eta).$

PROOF: Clearly, the limit of $\sup_{\xi_{\alpha_i}} \inf_{\eta} K(\xi_{\alpha_i}, \eta)$ exists as $i \to \infty$ and cannot
exceed the value of the right hand member in (1.3). Put

(1.4)    $\lim_{i \to \infty} \sup_{\xi_{\alpha_i}} \inf_{\eta} K(\xi_{\alpha_i}, \eta) = \rho$








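and

(1.5)    $\sup_{\xi_\alpha} \inf_{\eta} K(\xi_\alpha, \eta) = \rho + \delta \quad (\delta \ge 0).$

Suppose that $\delta > 0$. Then there exists a probability measure $\xi_\alpha^0$ such that

(1.6)    $K(\xi_\alpha^0, \eta) \ge \rho + \frac{\delta}{2}$ for all $\eta$.

Let $\xi_{\alpha_i}^0$ be the probability measure defined as follows: for any subset $\alpha^*$ of $\alpha_i$
we have

(1.7)    $\xi_{\alpha_i}^0(\alpha^*) = \frac{\xi_\alpha^0(\alpha^*)}{\xi_\alpha^0(\alpha_i)}.$

Then, since $\lim_{i \to \infty} \xi_\alpha^0(\alpha - \alpha_i) = 0$, we have

(1.8)    $\lim_{i \to \infty} K(\xi_{\alpha_i}^0, \eta) = K(\xi_\alpha^0, \eta)$

uniformly in $\eta$. Hence, for sufficiently large i, we have

(1.9)    $\inf_{\eta} K(\xi_{\alpha_i}^0, \eta) > \rho + \frac{\delta}{3},$

which is a contradiction to (1.4). Thus, $\delta = 0$ and Lemma 1.1 is proved.
Interchanging the role of the two players, we obtain the following lemma.

LEMMA 1.2. Let $\{\beta_i\}$ be a sequence of subsets of B such that $\beta_i \subset \beta_{i+1}$ and let
$\sum_{i=1}^{\infty} \beta_i = \beta$. Then

(1.10)    $\lim_{i \to \infty} \inf_{\eta_{\beta_i}} \sup_{\xi} K(\xi, \eta_{\beta_i}) = \inf_{\eta_\beta} \sup_{\xi} K(\xi, \eta_\beta).$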


We shall now prove the following lemma. 
LEMMA 1.3. The inequality²

(1.11)    $\sup_{\xi} \inf_{\eta} K(\xi, \eta) \le \inf_{\eta} \sup_{\xi} K(\xi, \eta)$

always holds.

PROOF: For any given $\epsilon > 0$, it is possible to find probability measures $\xi^0$ and
$\eta^0$ such that

(1.12)    $\inf_{\eta} \sup_{\xi} K(\xi, \eta) \ge \sup_{\xi} K(\xi, \eta^0) - \epsilon$

and

(1.13)    $\sup_{\xi} \inf_{\eta} K(\xi, \eta) \le \inf_{\eta} K(\xi^0, \eta) + \epsilon.$

Then we have

(1.14)    $\sup_{\xi} \inf_{\eta} K(\xi, \eta) \le \inf_{\eta} K(\xi^0, \eta) + \epsilon \le K(\xi^0, \eta^0) + \epsilon \le \sup_{\xi} K(\xi, \eta^0) + \epsilon \le \inf_{\eta} \sup_{\xi} K(\xi, \eta) + 2\epsilon.$

Since $\epsilon$ can be chosen arbitrarily small, Lemma 1.3 is proved.

2 This inequality was given by v. Neumann [1] for finite spaces A and B.

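THEOREM 1.1. If $\alpha$ is a subset of A such that

$\sup_{\xi_\alpha} \inf_{\eta} K(\xi_\alpha, \eta) = \inf_{\eta} \sup_{\xi_\alpha} K(\xi_\alpha, \eta)$

and

$\inf_{\eta} \sup_{\xi_\alpha} K(\xi_\alpha, \eta) = \inf_{\eta} \sup_{\xi} K(\xi, \eta),$

then

$\sup_{\xi} \inf_{\eta} K(\xi, \eta) = \inf_{\eta} \sup_{\xi} K(\xi, \eta).$

PROOF: Clearly,

(1.15)    $\sup_{\xi_\alpha} \inf_{\eta} K(\xi_\alpha, \eta) \le \sup_{\xi} \inf_{\eta} K(\xi, \eta)$

and

(1.16)    $\inf_{\eta} \sup_{\xi_\alpha} K(\xi_\alpha, \eta) \le \inf_{\eta} \sup_{\xi} K(\xi, \eta).$

If the left hand members of (1.15) and (1.16) are equal to each other and
equal to the right member of (1.16), then

(1.17)    $\sup_{\xi} \inf_{\eta} K(\xi, \eta) \ge \inf_{\eta} \sup_{\xi} K(\xi, \eta).$

Because of Lemma 1.3 the equality sign must hold and Theorem 1.1 is proved.

Interchanging the two players, we obtain from Theorem 1.1:

THEOREM 1.2. If $\beta$ is a subset of B such that

$\sup_{\xi} \inf_{\eta_\beta} K(\xi, \eta_\beta) = \inf_{\eta_\beta} \sup_{\xi} K(\xi, \eta_\beta)$

and

$\sup_{\xi} \inf_{\eta_\beta} K(\xi, \eta_\beta) = \sup_{\xi} \inf_{\eta} K(\xi, \eta),$

then

$\sup_{\xi} \inf_{\eta} K(\xi, \eta) = \inf_{\eta} \sup_{\xi} K(\xi, \eta).$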
We shall now prove the following theorem. 
THEOREM 1.3. If $\{\alpha_i\}$ is a sequence of subsets of A such that $\alpha_i \subset \alpha_{i+1}$ and
$\sum_{i=1}^{\infty} \alpha_i = A$, and if

(1.18)    $\sup_{\xi_{\alpha_i}} \inf_{\eta} K(\xi_{\alpha_i}, \eta) = \inf_{\eta} \sup_{\xi_{\alpha_i}} K(\xi_{\alpha_i}, \eta)$

for each i, then a necessary and sufficient condition for the validity of

(1.19)    $\sup_{\xi} \inf_{\eta} K(\xi, \eta) = \inf_{\eta} \sup_{\xi} K(\xi, \eta)$

is that

(1.20)    $\lim_{i \to \infty} \inf_{\eta} \sup_{\xi_{\alpha_i}} K(\xi_{\alpha_i}, \eta) = \inf_{\eta} \sup_{\xi} K(\xi, \eta).$

PROOF: Because of (1.18) and Lemma 1.1 we have

(1.21)    $\lim_{i \to \infty} \inf_{\eta} \sup_{\xi_{\alpha_i}} K(\xi_{\alpha_i}, \eta) = \sup_{\xi} \inf_{\eta} K(\xi, \eta).$

Hence, (1.20) implies (1.19) and (1.19) implies (1.20). This proves Theorem 1.3.
Interchanging the role of the two players, we obtain from Theorem 1.3 the
following theorem.

THEOREM 1.4. If $\{\beta_i\}$ is a sequence of subsets of B such that $\beta_i \subset \beta_{i+1}$ and
$\sum_{i=1}^{\infty} \beta_i = B$, and if

$\sup_{\xi} \inf_{\eta_{\beta_i}} K(\xi, \eta_{\beta_i}) = \inf_{\eta_{\beta_i}} \sup_{\xi} K(\xi, \eta_{\beta_i}),$

then a necessary and sufficient condition for the validity of (1.19) is that

(1.22)    $\lim_{i \to \infty} \sup_{\xi} \inf_{\eta_{\beta_i}} K(\xi, \eta_{\beta_i}) = \sup_{\xi} \inf_{\eta} K(\xi, \eta).$

In [3] an intrinsic metric was introduced in the spaces A and B. The distance
of two elements $a_1$ and $a_2$ of A is defined by

(1.23)    $\delta(a_1, a_2) = \sup_{b} | K(a_1, b) - K(a_2, b) |.$

Similarly, the distance between two points $b_1$ and $b_2$ of B is defined by

(1.24)    $\delta(b_1, b_2) = \sup_{a} | K(a, b_1) - K(a, b_2) |.$
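
For finite A and B the suprema in (1.23) and (1.24) are maxima over the columns or rows of the payoff matrix, which the following sketch computes. The matrix and the function names `dist_a` and `dist_b` are hypothetical and serve only to illustrate the definitions.

```python
def dist_a(K, a1, a2):
    """Distance (1.23) between pure strategies a1, a2 of player 1 (row indices)."""
    return max(abs(K[a1][b] - K[a2][b]) for b in range(len(K[0])))


def dist_b(K, b1, b2):
    """Distance (1.24) between pure strategies b1, b2 of player 2 (column indices)."""
    return max(abs(K[a][b1] - K[a][b2]) for a in range(len(K)))


# Hypothetical payoff matrix; rows index A, columns index B.
K = [[0, 2, 5],
     [1, 2, 3],
     [0, 2, 5]]

d_rows_01 = dist_a(K, 0, 1)   # max(|0-1|, |2-2|, |5-3|) = 2
d_rows_02 = dist_a(K, 0, 2)   # identical rows, so the distance is 0
d_cols_12 = dist_b(K, 1, 2)   # max(|2-5|, |2-3|, |2-5|) = 3
```

In this example rows 0 and 2 are at distance zero: two strategies with the same payoffs against every opposing strategy are not distinguished by the metric.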


Suppose that there exists a sequence $\{\alpha_i\}$ of subsets of A such that $\alpha_i$ is
conditionally compact, $\alpha_i \subset \alpha_{i+1}$ and $\sum_{i=1}^{\infty} \alpha_i = A$.³ It was shown in [3] that for
any conditionally compact subset $\alpha_i$ the relation (1.18) holds. Hence, according
to Theorem 1.3, a necessary and sufficient condition for the validity of (1.19)
is that (1.20) holds for a sequence $\{\alpha_i\}$ where $\alpha_i$ is conditionally compact,
$\alpha_i \subset \alpha_{i+1}$, and $\sum_{i=1}^{\infty} \alpha_i = A$. Similar remarks can be made concerning the space B.
The distance definitions given in (1.23) and (1.24) can be extended to the spaces
of the probability measures $\xi$ and $\eta$, respectively. That is,

(1.25)    $\delta(\xi_1, \xi_2) = \sup_{\eta} | K(\xi_1, \eta) - K(\xi_2, \eta) |$

and

(1.26)    $\delta(\eta_1, \eta_2) = \sup_{\xi} | K(\xi, \eta_1) - K(\xi, \eta_2) |.$

3 For a definition of compact and conditionally compact sets, see F. Hausdorff, Mengenlehre
(3rd edition), p. 107, or [3, p. 296].

We shall say that a probability measure $\xi$ is discrete if there exists a denumerable
subset $\alpha$ of A such that $\xi(\alpha) = 1$. Similarly, a probability measure $\eta$ will
be said to be discrete if $\eta(\beta) = 1$ for some denumerable subset $\beta$ of B. We shall
now prove the following theorem.

THEOREM 1.5. If the choice of player 1 is restricted to elements of a class C of
probability measures $\xi$ in which the class of all discrete probability measures $\xi$ is
dense, then a necessary and sufficient condition for the game to be strictly determined
is that there exists a sequence $\{a_i\}$ of elements of A such that

(1.27)    $\lim_{i \to \infty} \inf_{\eta} \sup_{\xi_{\alpha_i}} K(\xi_{\alpha_i}, \eta) = \inf_{\eta} \sup_{\xi} K(\xi, \eta)$

where

$\alpha_i = \{a_1, a_2, \ldots, a_i\}.$

PROOF: Since the class of all discrete probability measures $\xi$ lies dense in the
class C, there exists a sequence $\alpha = \{a_i\}$ (i = 1, 2, ..., ad inf.)
such that

(1.28)    $\sup_{\xi_\alpha} \inf_{\eta} K(\xi_\alpha, \eta) = \sup_{\xi} \inf_{\eta} K(\xi, \eta).$

Since $\alpha_i = \{a_1, \ldots, a_i\}$ is finite, we have

(1.29)    $\inf_{\eta} \sup_{\xi_{\alpha_i}} K(\xi_{\alpha_i}, \eta) = \sup_{\xi_{\alpha_i}} \inf_{\eta} K(\xi_{\alpha_i}, \eta).$

It then follows from Lemma 1.1 that

(1.30)    $\lim_{i \to \infty} \inf_{\eta} \sup_{\xi_{\alpha_i}} K(\xi_{\alpha_i}, \eta) = \sup_{\xi_\alpha} \inf_{\eta} K(\xi_\alpha, \eta) = \sup_{\xi} \inf_{\eta} K(\xi, \eta).$

Clearly, (1.30) and strict determinateness of the game imply (1.27). On the
other hand, any $\alpha = \{a_i\}$ that satisfies (1.27) will satisfy also (1.28) and (1.30).
But (1.27) and (1.30) imply that the game is strictly determined. Thus,
Theorem 1.5 is proved.

THEOREM 1.6. If the choice of player 2 is restricted to elements of a class C of
probability measures $\eta$ in which the class of all discrete probability measures $\eta$ lies
dense, then a necessary and sufficient condition for the strict determinateness of the
game is that there exists a sequence $\beta = \{b_i\}$ of elements of B such that

(1.31)    $\lim_{i \to \infty} \sup_{\xi} \inf_{\eta_{\beta_i}} K(\xi, \eta_{\beta_i}) = \sup_{\xi} \inf_{\eta} K(\xi, \eta)$

where

$\beta_i = \{b_1, b_2, \ldots, b_i\}.$

This theorem is obtained from Theorem 1.5 by interchanging the players
1 and 2.










2. Statistical decision functions: the case of discrete chance variables.

2.1. The problem of statistical decisions and its interpretation as a zero sum two
person game. In some previous publications (see, for example, [3]) the author
has formulated the problem of statistical decisions as follows: Let $X = \{X^i\}$
(i = 1, 2, ..., ad inf.) be an infinite sequence of chance variables. Any particular
observation x on X is given by a sequence $x = \{x^i\}$ of real values where $x^i$
denotes the observed value of $X^i$. Suppose that the probability distribution
F(x) of X is not known. It is, however, known that F is an element of a given
class $\Omega$ of distribution functions. There is, furthermore, a space D given whose
elements d represent the possible decisions that can be made in the problem
under consideration. Usually each element d of D will be associated with a
certain subset $\omega$ of $\Omega$ and making the decision d can be interpreted as accepting
the hypothesis that the true distribution is included in the subset $\omega$. The
fundamental problem in statistics is to give a rule for making a decision, that is, a
rule for selecting a particular element d of D on the basis of the observed sample
point x. In other words, the problem is to construct a function d(x), called
decision function, which associates with each sample point x an element d(x)
of D so that the decision d(x) is made when the sample point x is observed.

This formulation of the problem includes the sequential as well as the classical
non-sequential case. For any sample point x, let n(x) be the number of
components of x that must be known to be able to determine the value of d(x). In
other words, n(x) is the smallest positive integer such that d(y) = d(x) for any y
whose first n coordinates are equal to the first n coordinates of x. If no finite
n exists with the above property, we put $n = \infty$. Clearly, n(x) is the number
of observations needed to reach a decision. To put in evidence the dependence
of n(x) on the decision rule used, we shall occasionally write $n(x; \mathfrak{D})$ instead of
n(x) where $\mathfrak{D}$ denotes the decision function d(x) used. If n(x) is constant over
the whole sample space, we have the classical case, that is the case where a
decision is to be made on the basis of a predetermined number of observations.
If n(x) is not constant over the sample space, we have the sequential case. A
basic question in statistics is this: What decision function should be chosen by
the statistician in any given problem? To set up principles for a proper choice of
a decision function, it is necessary to express in some way the degree of
importance of the various wrong decisions that can be made in the problem under
consideration. This may be expressed by a non-negative function W(F, d),
called weight function, which is defined for all elements F of $\Omega$ and all elements
d of D. For any pair (F, d), the value W(F, d) expresses the loss caused by
making the decision d when F is the true distribution of X. For any positive
integer n, let c(n) denote the cost of making n observations. If the decision
function $\mathfrak{D} = d(x)$ is used the expected loss plus the expected cost of
experimentation is given by


(2.1)    $r[F, \mathfrak{D}] = \int_M W[F, d(x)]\, dF(x) + \int_M c(n(x))\, dF(x)$




where M denotes the sample space, i.e. the totality of all sample points x. We
shall use the symbol $\mathfrak{D}$ for d(x) when we want to indicate that we mean the whole
decision function and not merely a value of d(x) corresponding to some x.

The above expression (2.1) is called the risk. Thus, the risk is a real valued
non-negative function of two variables F and $\mathfrak{D}$ where F may be any element
of $\Omega$ and $\mathfrak{D}$ any decision rule that may be adopted by the statistician.
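
To make the risk (2.1) concrete, the following sketch evaluates it in the simplest situation: a decision rule that always stops after the first observation, so that n(x) = 1 and the integrals reduce to finite sums over the values of the first observation. The two distributions, the zero-one weights, the rule, and the cost value are all hypothetical choices made for this illustration.

```python
from fractions import Fraction


def risk_one_observation(p_first, loss, rule, cost):
    """Risk when the decision depends on the first observation only.

    p_first: dict mapping each value of the first observation to its probability
             under the true distribution F.
    loss:    dict mapping each decision d to the weight W(F, d).
    rule:    dict mapping each observed value to the decision taken.
    cost:    c(1), the cost of a single observation.
    """
    expected_loss = sum(prob * loss[rule[x1]] for x1, prob in p_first.items())
    return expected_loss + cost


# Hypothetical class of two admissible distributions of the first observation.
p = {"F1": {1: Fraction(4, 5), 2: Fraction(1, 5)},
     "F2": {1: Fraction(3, 10), 2: Fraction(7, 10)}}
# Zero-one weights: d1 is correct under F1, d2 is correct under F2.
W = {"F1": {"d1": 0, "d2": 1},
     "F2": {"d1": 1, "d2": 0}}
rule = {1: "d1", 2: "d2"}
c1 = Fraction(1, 20)

risks = {F: risk_one_observation(p[F], W[F], rule, c1) for F in p}
worst_case = max(risks.values())
```

The risk depends on which distribution is true (1/4 under the first, 7/20 under the second in this example), which is the difficulty taken up in the text that follows: the statistician controls the rule but not the distribution.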

Of course, the statistician would like to make the risk r as small as possible.
The difficulty he faces in this connection is that r depends on two arguments F
and $\mathfrak{D}$, and he can merely choose $\mathfrak{D}$ but not F. The true distribution F is chosen,
we may say, by Nature and Nature's choice is usually entirely unknown to the
statistician. Thus, the situation that arises here is very similar to that of a
zero sum two person game. As a matter of fact, the statistical problem may be
interpreted as a zero sum two person game by setting up the following
correspondence:


Two Person Game                     Statistical Decision Problem

Player 1                            Nature
Player 2                            Statistician
Pure strategy a of player 1         Choice of true distribution F by Nature
Pure strategy b of player 2         Choice of decision rule $\mathfrak{D} = d(x)$
Space A                             Space $\Omega$
Space B                             Space Q of decision rules $\mathfrak{D}$ that can be used by
                                    the statistician
Outcome K(a, b)                     Risk $r(F, \mathfrak{D})$
Mixed strategy $\xi$ of player 1    Probability measure $\xi$ defined over an additive
                                    class of subsets of $\Omega$ (a priori probability
                                    distribution in the space $\Omega$)
Mixed strategy $\eta$ of player 2   Probability measure $\eta$ defined over an additive
                                    class of subsets of the space Q. We shall refer
                                    to $\eta$ as randomized decision function.
Outcome $K(\xi, \eta)$ when mixed   Risk $r(\xi, \eta) = \int_Q \int_\Omega r(F, \mathfrak{D})\, d\xi\, d\eta$
strategies are used


2.2. Formulation of some conditions concerning the spaces $\Omega$, D, the weight
function W(F, d) and the cost function of experimentation. A general theory of
statistical decision functions was developed in [3] assuming the fulfillment of seven
conditions listed on pp. 297-8.⁴ The conditions listed there are unnecessarily
restrictive and we shall replace them here by a considerably weaker set of
conditions.

4 In [3] only the continuous case is treated (existence of a density function is assumed),
but all the results obtained there can be extended without difficulty to the discrete case.

In this chapter we shall restrict ourselves to the study of the case where each
of the chance variables $X^1, X^2, \ldots$, ad inf. is discrete. We shall say that a chance
variable is discrete if it can take only countably many different values. Let
$a_{i1}, a_{i2}, \ldots$, ad inf. denote the possible values of the chance variable $X^i$. Since
it is immaterial how the values $a_{ij}$ are labeled, there is no loss of generality in
putting $a_{ij} = j$ (j = 1, 2, 3, ..., ad inf.). Thus, we formulate the following
condition.

CONDITION 2.1. The chance variable $X^i$ (i = 1, 2, ..., ad inf.) can take only
positive integral values.

As in [3], also here we postulate the boundedness of the weight function, i.e.,
we formulate the following condition.

CONDITION 2.2. The weight function W(F, d) is a bounded function of F and d.

To formulate condition 2.3, we shall introduce some definitions. Let $\omega$ be a
given subset of $\Omega$. The distance between two elements $d_1$ and $d_2$ of D relative to
$\omega$ is defined by

(2.2)    $\delta(d_1, d_2; \omega) = \sup_{F \in \omega} | W(F, d_1) - W(F, d_2) |.$

We shall refer to $\delta(d_1, d_2; \Omega)$ as the absolute distance, or more briefly, the
distance between $d_1$ and $d_2$. We shall say that a subset $D^*$ of D is compact
(conditionally compact) relative to $\omega$, if it is compact (conditionally compact) in
the sense of the metric $\delta(d_1, d_2; \omega)$. If $D^*$ is compact relative to $\Omega$, we shall
say briefly that $D^*$ is compact.

An element d of D is said to be uniformly better than the element d' of D
relative to a subset $\omega$ of $\Omega$ if

$W(F, d) \le W(F, d')$ for all F in $\omega$

and if

$W(F, d) < W(F, d')$ for at least one F in $\omega$.

A subset $D^*$ of D is said to be complete relative to a subset $\omega$ of $\Omega$ if for any d
outside $D^*$ there exists an element $d^*$ in $D^*$ such that $d^*$ is uniformly better than
d relative to $\omega$.
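
The definitions of relative distance, "uniformly better" and "complete" can be checked mechanically when the class of distributions and the decision space are both finite. The sketch below does so for a hypothetical weight table; the names `distance`, `uniformly_better` and `is_complete`, and the table itself, are assumptions introduced for illustration.

```python
def distance(W, d1, d2, omega):
    """Distance (2.2) between decisions d1 and d2 relative to the subset omega."""
    return max(abs(W[F][d1] - W[F][d2]) for F in omega)


def uniformly_better(W, d, d_prime, omega):
    """True if d is uniformly better than d_prime relative to omega."""
    never_worse = all(W[F][d] <= W[F][d_prime] for F in omega)
    somewhere_strict = any(W[F][d] < W[F][d_prime] for F in omega)
    return never_worse and somewhere_strict


def is_complete(W, D_star, D, omega):
    """True if every d outside D_star is uniformly beaten by some element of D_star."""
    return all(any(uniformly_better(W, s, d, omega) for s in D_star)
               for d in D if d not in D_star)


# Hypothetical weight table W[F][d] with two distributions and three decisions.
W = {"F1": {"d1": 0, "d2": 1, "d3": 1},
     "F2": {"d1": 1, "d2": 0, "d3": 1}}
omega = ["F1", "F2"]
D = ["d1", "d2", "d3"]

d3_dominated = uniformly_better(W, "d1", "d3", omega)     # d1 never worse, better under F1
pair_complete = is_complete(W, ["d1", "d2"], D, omega)    # only d3 lies outside
single_complete = is_complete(W, ["d1"], D, omega)        # d2 is not beaten by d1
```

The distance depends on the subset used: here the distance between d1 and d3 is 1 relative to the whole table but 0 relative to the subset containing only F2.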

CONDITION 2.3. For any positive integer i and for any positive $\epsilon$ there exists a
subset $D^*_{i,\epsilon}$ of D which is compact relative to $\Omega$ and complete relative to $\omega_{i,\epsilon}$ where
$\omega_{i,\epsilon}$ is the class of all elements F of $\Omega$ for which prob $\{X^1 \le i\} \ge \epsilon$.

If D is compact, then it is compact with respect to any subset $\omega$ of $\Omega$ and
Condition 2.3 is fulfilled. For any finite space D, Condition 2.3 is obviously
fulfilled. Thus, Condition 2.3 is fulfilled, for example, for any problem of testing
a statistical hypothesis H, since in that case the space D contains only two
elements $d_1$ and $d_2$ where $d_1$ denotes the decision to reject H and $d_2$ the decision to
accept H.

In [3] it was assumed that the cost of experimentation depends only on the
number of observations made. This assumption is unnecessarily restrictive.
The cost may depend also on the decision rule $\mathfrak{D}$ used. For example, let $\mathfrak{D}_1$
and $\mathfrak{D}_2$ be two decision rules such that $n(x; \mathfrak{D}_1)$ is equal to a constant $n_0$, while






$\mathfrak{D}_2$ is such that at any stage of the experimentation where $\mathfrak{D}_2$ requires taking at
least one additional observation the probability is positive that experimentation
will be terminated by taking only one more observation. Let $x^0$ be a particular
sample point for which $n(x^0; \mathfrak{D}_2) = n(x^0; \mathfrak{D}_1) = n_0$. There are undoubtedly
cases where the cost of experimentation is appreciably increased by the necessity
of having to look at the observations at each stage of the experiment before we
can decide whether or not to continue taking additional observations. Thus
in many cases the cost of experimentation when $x^0$ is observed may be greater
for $\mathfrak{D}_2$ than for $\mathfrak{D}_1$. The cost may also depend on the actual values of the
observations made. Thus, we shall assume that the cost c is a single valued
function of the observations $x^1, \ldots, x^m$ and the decision rule $\mathfrak{D}$ used, i.e.,
$c = c(x^1, \ldots, x^m, \mathfrak{D})$.

CONDITION 2.4. The cost $c(x^1, \ldots, x^m, \mathfrak{D})$ is non-negative and
$\lim c(x^1, \ldots, x^m, \mathfrak{D}) = \infty$ uniformly in $x^1, \ldots, x^m, \mathfrak{D}$ as $m \to \infty$. For each
positive integral value m, there exists a finite value $c_m$, depending only on m, such
that $c(x^1, \ldots, x^m, \mathfrak{D}) \le c_m$ identically in $x^1, \ldots, x^m, \mathfrak{D}$. Furthermore,
$c(x^1, \ldots, x^m, \mathfrak{D}_1) = c(x^1, \ldots, x^m, \mathfrak{D}_2)$ if $n(x; \mathfrak{D}_1) = n(x; \mathfrak{D}_2)$ for all x. Finally,
for any sample point x we have $c(x^1, \ldots, x^{n(x, \mathfrak{D}_1)}, \mathfrak{D}_1) \le c(x^1, \ldots, x^{n(x, \mathfrak{D}_2)}, \mathfrak{D}_2)$
if there exists a positive integer m such that $n(x, \mathfrak{D}_1) = n(x, \mathfrak{D}_2)$ when $n(x, \mathfrak{D}_2) < m$
and $n(x, \mathfrak{D}_1) = m$ when $n(x, \mathfrak{D}_2) \ge m$.

2.3. Alternative definition of a randomized decision function, and a further
condition on the cost function. In Section 2.1 we defined a randomized decision
function as a probability measure $\eta$ defined over some additive class of subsets
of the space Q of all decision functions d(x). Before formulating an alternative
definition of a randomized decision function, we have to make precise the
meaning of $\eta$ by stating the additive class $C_Q$ of subsets of Q over which $\eta$ is defined.
Let $C_D$ be the smallest additive class of subsets of D which contains all subsets
of D which are open in the sense of the metric $\delta(d_1, d_2; \Omega)$. For any finite set of
positive integers $a_1, \ldots, a_k$ and for any element $D^*$ of $C_D$, let $Q(a_1, \ldots, a_k, D^*)$
be the set of all decision functions d(x) which satisfy the following two
conditions: (1) If $x^1 = a_1, x^2 = a_2, \ldots, x^k = a_k$, then n(x) = k; (2) If $x^1 = a_1, \ldots,$
$x^k = a_k$, then d(x) is an element of $D^*$. Let $C'_Q$ be the class of all sets
$Q(a_1, \ldots, a_k, D^*)$ corresponding to all possible values of k, $a_1, \ldots, a_k$ and all
possible elements $D^*$ of $C_D$. The additive class $C_Q$ is defined as the smallest
additive class containing $C'_Q$ as a subclass. Then with any $\eta$ we can associate
two sequences of functions

$\{z_m(x^1, \ldots, x^m \mid \eta)\}$

and

$\{\delta_{x^1 \ldots x^m}(D^* \mid \eta)\}$ (m = 1, 2, ..., ad inf.)

where $0 \le z_m(x^1, \ldots, x^m \mid \eta) \le 1$ and for any $x^1, \ldots, x^m$, $\delta_{x^1 \ldots x^m}$ is a
probability measure in D defined over the additive class $C_D$. Here

$z_m(x^1, \ldots, x^m \mid \eta)$









































denotes the conditional probability that n(x) > m under the condition that
the first m observations are equal to $x^1, \ldots, x^m$ and experimentation has not
been terminated for $(x^1, \ldots, x^k)$ for (k = 1, 2, ..., m - 1), while

$\delta_{x^1 \ldots x^m}(D^* \mid \eta)$

is the conditional probability that the final decision d will be an element of $D^*$
under the condition that the sample $(x^1, \ldots, x^m)$ is observed and n(x) = m.
Thus

(2.3)    $z_1(x^1 \mid \eta)\, z_2(x^1, x^2 \mid \eta) \cdots z_{m-1}(x^1, \ldots, x^{m-1} \mid \eta)\, [1 - z_m(x^1, \ldots, x^m \mid \eta)] = \eta[Q(x^1, \ldots, x^m, D)]$

and

(2.4)    $\delta_{x^1 \ldots x^m}(D^* \mid \eta) = \frac{\eta[Q(x^1, \ldots, x^m, D^*)]}{\eta[Q(x^1, \ldots, x^m, D)]}.$
We shall now consider two sequences of functions $\{z_m(x^1, \dots, x^m)\}$ and $\{\delta_{x^1 \dots x^m}(D^*)\}$, not necessarily generated by a given $\eta$. An alternative definition of a randomized decision function can be given in terms of these two sequences as follows: After the first observation $x^1$ has been drawn, the statistician determines whether or not experimentation be continued by a chance mechanism constructed so that the probability of continuing experimentation is equal to $z_1(x^1)$. If it is decided to terminate experimentation, the statistician uses a chance mechanism to select the final decision $d$ constructed so that the probability distribution of the selected $d$ is equal to $\delta_{x^1}(D^*)$. If it is decided to take a second observation and the value $x^2$ is obtained, again a chance mechanism is used to determine whether or not to stop experimentation such that the probability of taking a third observation is equal to $z_2(x^1, x^2)$. If it is decided to stop experimentation, a chance mechanism is used to select the final $d$ so that the probability distribution of the selected $d$ is equal to $\delta_{x^1 x^2}(D^*)$, and so on.
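
The chance mechanism just described can be sketched in code. What follows is only an illustration under stated assumptions, not part of the original text: the names `run_rule`, `stop_probability`, `z`, `delta` and `draw_observation` are hypothetical, `z(sample)` is assumed to return the continuation probability $z_m(x^1, \dots, x^m)$, and `delta(sample)` is assumed to return a distribution over finitely many terminal decisions.

```python
import random

def run_rule(z, delta, draw_observation, rng, max_steps=1000):
    """Simulate one run of a rule given by the two sequences {z_m} and {delta}.

    z(sample)     -> probability of continuing after `sample` (assumed in [0, 1])
    delta(sample) -> dict mapping each terminal decision d to its probability
    All three callables are hypothetical stand-ins used for illustration.
    """
    sample = []
    for _ in range(max_steps):
        sample.append(draw_observation(rng))       # take the next observation x^m
        if rng.random() >= z(tuple(sample)):       # stop with probability 1 - z_m
            dist = delta(tuple(sample))
            decisions = list(dist)
            weights = [dist[d] for d in decisions]
            d = rng.choices(decisions, weights=weights)[0]
            return tuple(sample), d
    raise RuntimeError("experimentation did not terminate within max_steps")

def stop_probability(z, sample):
    """Product z_1 * ... * z_{m-1} * (1 - z_m) along `sample`, as in (2.3)."""
    prob = 1.0
    for m in range(1, len(sample)):
        prob *= z(tuple(sample[:m]))
    return prob * (1.0 - z(tuple(sample)))
```

For a given sample path, `stop_probability` evaluates the conditional probability of stopping exactly at the last observation, which is the product appearing on the left of (2.3).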

We shall denote by $\zeta$ a randomized decision function defined in terms of two sequences $\{z_m(x^1, \dots, x^m)\}$ and $\{\delta_{x^1 \dots x^m}(D^*)\}$, as described above. Clearly, any given $\eta$ generates a particular $\zeta$. Let $\zeta(\eta)$ denote the $\zeta$ generated by $\eta$. One can easily verify that two different $\eta$'s may generate the same $\zeta$, i.e., there exist two different $\eta$'s, say $\eta_1$ and $\eta_2$, such that $\zeta(\eta_1) = \zeta(\eta_2)$.

We shall now show that for any $\zeta$ there exists an $\eta$ such that $\zeta(\eta) = \zeta$. Let $\zeta$ be given by the two sequences $\{z_m(x^1, \dots, x^m)\}$ and $\{\delta_{x^1 \dots x^m}(D^*)\}$. Let $b_j$ denote a sequence of $r_j$ positive integers, i.e., $b_j = (b_{j1}, \dots, b_{jr_j})$ $(j = 1, 2, \dots, k)$ subject to the restriction that no $b_j$ is equal to an initial segment of $b_l$ $(j \ne l)$. Let, furthermore, $D_1^*, \dots, D_k^*$ be $k$ elements of $C_D$. Finally, let $Q(b_1, \dots, b_k, D_1^*, \dots, D_k^*)$ denote the class of all decision functions $d(x)$ which satisfy the following condition: If $(x^1, \dots, x^{r_j}) = b_j$ then $n(x) = r_j$ and $d(x)$ is an element of $D_j^*$ $(j = 1, \dots, k)$. Let $\eta$ be a probability measure such that

$$\eta[Q(b_1, \dots, b_k, D_1^*, \dots, D_k^*)] = \delta_{b_1}(D_1^*) \cdots \delta_{b_k}(D_k^*) \prod_{m=1}^{\infty} \prod_{x^1=1}^{\infty} \prod_{x^2=1}^{\infty} \cdots \prod_{x^m=1}^{\infty} \left\{ z_m(x^1, \dots, x^m)^{g_m(x^1, \dots, x^m)} \, [1 - z_m(x^1, \dots, x^m)]^{g_m^*(x^1, \dots, x^m)} \right\} \tag{2.5}$$

holds for all values of $k, b_1, \dots, b_k, D_1^*, \dots, D_k^*$. Here $g_m(x^1, \dots, x^m) = 1$ if $(x^1, \dots, x^m)$ is equal to an initial segment of at least one of the samples $b_1, \dots, b_k$, but is not equal to any of the samples $b_1, \dots, b_k$. In all other cases $g_m(x^1, \dots, x^m) = 0$. The function $g_m^*(x^1, \dots, x^m)$ is equal to 1 if $(x^1, \dots, x^m)$ is equal to one of the samples $b_1, \dots, b_k$, and zero otherwise.
Clearly, for any $\eta$ which satisfies (2.5) we have $\zeta(\eta) = \zeta$. The existence of such an $\eta$ can be shown as follows. With any finite set of positive integers $i_1, \dots, i_r$ we associate an elementary event, say $A_r(i_1, \dots, i_r)$. Let $\bar{A}_r(i_1, \dots, i_r)$ denote the negation of the event $A_r(i_1, \dots, i_r)$. Thus, we have a denumerable system of elementary events by letting $r, i_1, \dots, i_r$ take any positive integral values. We shall assume that the events $A_1(1), A_1(2), \dots$, ad inf. are independent and the probability that $A_1(i)$ happens is equal to $z_1(i)$. We shall now define the conditional probability of $A_2(i, j)$ knowing for any $k$ whether $A_1(k)$ or $\bar{A}_1(k)$ happened. If $A_1(i)$ happened, the conditional probability of $A_2(i, j)$ is equal to $z_2(i, j)$, and it is 0 otherwise. The conditional probability of the joint event that $A_2(i_1, j_1), A_2(i_2, j_2), \dots, A_2(i_r, j_r), \bar{A}_2(i_{r+1}, j_{r+1}), \dots$, and $\bar{A}_2(i_{r+s}, j_{r+s})$ will happen is the product of the conditional probabilities of each of these events (knowing for each $i$ whether $A_1(i)$ or $\bar{A}_1(i)$ happened). Similarly, the conditional probability (knowing for any $i$ and for any $(i, j)$, whether the corresponding event $A_2(i, j)$ happened or not) that $A_3(i_1, j_1, k_1)$ and $A_3(i_2, j_2, k_2)$ and $\cdots$ $A_3(i_r, j_r, k_r)$ and $\bar{A}_3(i_{r+1}, j_{r+1}, k_{r+1})$ and $\cdots$ and $\bar{A}_3(i_{r+s}, j_{r+s}, k_{r+s})$ will simultaneously happen is equal to the product of the conditional probabilities of each of them. The conditional probability of $A_3(i, j, k)$ is equal to $z_3(i, j, k)$ if $A_1(i)$ and $A_2(i, j)$ happened, and zero otherwise; and so on. Clearly, this system of probabilities is consistent.

If we interpret $A_r(i_1, \dots, i_r)$ as the event that the decision function $\mathfrak{D} = d(x)$ selected by the statistician has the property that $n(x; \mathfrak{D}) > r$ when $x^1 = i_1, \dots, x^r = i_r$, the above defined system of probabilities for the denumerable sequence $\{A_r(i_1, \dots, i_r)\}$ of events implies the validity of (2.5) for $D_j^* = D$ $(j = 1, \dots, k)$. The consistency of the formula (2.5) for $D_j^* = D$ implies, as can easily be verified, the consistency of (2.5) also in the general case when $D_j^* \ne D$.

Let $\zeta_i$ be given by the sequences $\{z_{m,i}(x^1, \dots, x^m)\}$ and $\{\delta_{x^1 \dots x^m, i}\}$ $(m = 1, 2, \dots$, ad inf.). Let, furthermore, $\zeta$ be given by $\{z_m(x^1, \dots, x^m)\}$ and $\{\delta_{x^1 \dots x^m}\}$. We shall say that

$$\lim_{i=\infty} \zeta_i = \zeta \tag{2.6}$$

if for any $m, x^1, \dots, x^m$ we have

$$\lim_{i=\infty} z_{m,i}(x^1, \dots, x^m) = z_m(x^1, \dots, x^m) \tag{2.7}$$

and

$$\lim_{i=\infty} \delta_{x^1 \dots x^m, i}(D^*) = \delta_{x^1 \dots x^m}(D^*) \tag{2.8}$$

for any open subset $D^*$ of $D$ whose boundary has probability measure zero according to the limit probability measure $\delta_{x^1 \dots x^m}$.

In addition to Condition 2.4, we shall impose the following continuity condition on the cost function.

CONDITION 2.5. If

$$\lim_{i=\infty} \zeta(\eta_i) = \zeta(\eta),$$

then

$$\lim_{i=\infty} \int_{Q(x^1, \dots, x^m)} c(x^1, \dots, x^m, \mathfrak{D}) \, d\eta_i = \int_{Q(x^1, \dots, x^m)} c(x^1, \dots, x^m, \mathfrak{D}) \, d\eta,$$

where $Q(x^1, \dots, x^m)$ is the class of all decision functions $\mathfrak{D}$ for which $n(y, \mathfrak{D}) = m$ if $y^1 = x^1, \dots, y^m = x^m$.


2.4. The main theorem. In this section we shall show that the statistical 
decision problem, viewed as a zero sum two person game, is strictly determined. 
It will be shown in subsequent sections that this basic theorem has many im- 
portant consequences for the theory of statistical decision functions. A precise 
formulation of the theorem is as follows: 

THEOREM 2.1. If Conditions 2.1-2.5 are fulfilled, the decision problem, viewed as a zero sum two person game, is strictly determined, i.e.,

$$\sup_{\xi} \inf_{\eta} r(\xi, \eta) = \inf_{\eta} \sup_{\xi} r(\xi, \eta). \tag{2.9}$$

To prove the above theorem, we shall first derive several lemmas.


LEMMA 2.1. For any $\epsilon > 0$, there exists a positive integer $m_\epsilon$, depending only on $\epsilon$, such that the value of $\sup_{\xi} \inf_{\eta} r(\xi, \eta)$ is not changed by more than $\epsilon$ if we restrict the choice of the statistician to decision functions $d(x)$ for which $n(x) \le m_\epsilon$ for all $x$.

PROOF: Put $W_0 = \sup_{F, d} W(F, d)$ and choose $m_\epsilon$ so that

$$c(x^1, \dots, x^m, \mathfrak{D}) > \frac{W_0^2}{\epsilon} \tag{2.10}$$

identically in $x^1, \dots, x^m$ and $\mathfrak{D}$ for all $m \ge m_\epsilon$. The existence of such a value $m_\epsilon$ follows from Condition 2.4. Consider the function $\inf_{\mathfrak{D}} r(\xi, \mathfrak{D})$. Our lemma is proved, if we can show that for any $\xi$, the value of $\inf_{\mathfrak{D}} r(\xi, \mathfrak{D})$ is not increased by more than $\epsilon$ if we restrict $\mathfrak{D}$ to be such that $n(x, \mathfrak{D}) \le m_\epsilon$ for all $x$. The latter statement is proved, if we can show that for any decision function $\mathfrak{D}_1 = d_1(x)$ we can find another decision function $\mathfrak{D}_2 = d_2(x)$ such that $n(x, \mathfrak{D}_2) \le m_\epsilon$ for all $x$ and $r(\xi, \mathfrak{D}_2) \le r(\xi, \mathfrak{D}_1) + \epsilon$. There are two cases to be considered: (a) $\operatorname{prob}\{n(X, \mathfrak{D}_1) > m_\epsilon \mid \xi\} \ge \epsilon / W_0$ and (b) $\operatorname{prob}\{n(X, \mathfrak{D}_1) > m_\epsilon \mid \xi\} < \epsilon / W_0$. In case (a) we have $r(\xi, \mathfrak{D}_1) \ge W_0$. In this case we can choose $\mathfrak{D}_2$ to be the rule that we decide for some element $d_0$ of $D$ without taking any observations. Clearly, for this choice of $\mathfrak{D}_2$ we shall have $r(\xi, \mathfrak{D}_2) \le r(\xi, \mathfrak{D}_1)$. In case (b) we choose $\mathfrak{D}_2$ as follows:

$$d_2(x) = d_1(x) \text{ whenever } n(x, \mathfrak{D}_1) \le m_\epsilon;$$
$$d_2(x) = d_0 \text{ whenever } n(x, \mathfrak{D}_1) > m_\epsilon,$$

where $d_0$ is an arbitrary element of $D$. Thus, $n(x, \mathfrak{D}_2) \le m_\epsilon$ for all $x$. Since $\operatorname{prob}\{n(X, \mathfrak{D}_1) > m_\epsilon \mid \xi\} < \epsilon / W_0$, it is clear that $r(\xi, \mathfrak{D}_2) \le r(\xi, \mathfrak{D}_1) + \epsilon$. Hence our lemma is proved.
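
The two bounds used in the proof can be spelled out as follows. This is only a sketch of the arithmetic. It assumes, beyond what is stated at this point in the text, that the weight and cost functions are non-negative, that taking no observations costs nothing, and that in case (b) truncating at $m_\epsilon$ does not increase the expected cost of experimentation.

```latex
% Case (a): the event {n(X, D_1) > m_eps} has probability >= eps / W_0,
% and by (2.10) the cost on that event exceeds W_0^2 / eps.
% Case (b): D_2 differs from D_1 only on that event, where the loss is at most W_0.
\begin{align*}
\text{(a)}\quad r(\xi, \mathfrak{D}_1)
  &\ge \operatorname{prob}\{n(X, \mathfrak{D}_1) > m_\epsilon \mid \xi\} \cdot \frac{W_0^2}{\epsilon}
   \ge \frac{\epsilon}{W_0} \cdot \frac{W_0^2}{\epsilon} = W_0 \ge r(\xi, \mathfrak{D}_2), \\
\text{(b)}\quad r(\xi, \mathfrak{D}_2)
  &\le r(\xi, \mathfrak{D}_1) + W_0 \cdot \operatorname{prob}\{n(X, \mathfrak{D}_1) > m_\epsilon \mid \xi\}
   < r(\xi, \mathfrak{D}_1) + \epsilon.
\end{align*}
```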

Let $Q^m$ denote the class of decision functions $\mathfrak{D}$ for which $n(x; \mathfrak{D}) \le m$ for all $x$. For any positive $\epsilon$, let $Q^{m,\epsilon}$ denote the class of all decision functions which satisfy the following two conditions simultaneously: (1) $n(x, \mathfrak{D}) \le m$ for all $x$; (2) $d(x)$ is an element of $D_{x^1,\epsilon}$, where $D_{x^1,\epsilon}$ denotes the subset of $D$ having the properties stated in Condition 2.3. Clearly, $Q^{m,\epsilon} \subset Q^m$. A probability measure $\eta$ will be denoted by $\eta^m$ if $\eta(Q^m) = 1$, and by $\eta^{m,\epsilon}$ if $\eta(Q^{m,\epsilon}) = 1$.

LEMMA 2.2. The following inequality holds:

$$\sup_{\xi} \inf_{\eta^m} r(\xi, \eta^m) \le \sup_{\xi} \inf_{\eta^{m,\epsilon}} r(\xi, \eta^{m,\epsilon}) \le \sup_{\xi} \inf_{\eta^m} r(\xi, \eta^m) + \epsilon W_0, \tag{2.11}$$

where $W_0$ is an upper bound of $W(F, d)$.

PROOF: The first half of (2.11) is obvious. If we replace the subscript $x^1$ by the chance variable $X^1$, the set $\omega_{x^1,\epsilon}$ defined in Condition 2.3 will be a random subset of $\Omega$. It follows easily from the definition of $\omega_{x^1,\epsilon}$ that

$$\operatorname{prob}\{F \in \omega_{X^1,\epsilon} \mid F\} \ge 1 - \epsilon. \tag{2.12}$$

With any decision function $\mathfrak{D} = d(x)$ we shall associate another decision function $\mathfrak{D}^* = d^*(x)$ such that $n(x, \mathfrak{D}) = n(x, \mathfrak{D}^*)$; $d^*(x) = d(x)$ whenever $d(x) \in D_{x^1,\epsilon}$; and $d^*(x)$ is an element of $D_{x^1,\epsilon}$ that is uniformly better than $d(x)$ relative to $\omega_{x^1,\epsilon}$ whenever $d(x) \notin D_{x^1,\epsilon}$. It follows from (2.12) and the fact that $W_0$ is an upper bound of $W(F, d)$ that

$$r(F, \mathfrak{D}^*) \le r(F, \mathfrak{D}) + \epsilon W_0. \tag{2.13}$$

The second half of (2.11) is an immediate consequence of (2.13) and our lemma is proved.
LEMMA 2.3. The equation

$$\sup_{\xi} \inf_{\eta^{m,\epsilon}} r(\xi, \eta^{m,\epsilon}) = \inf_{\eta^{m,\epsilon}} \sup_{\xi} r(\xi, \eta^{m,\epsilon}) \tag{2.14}$$

holds for all $m$ and $\epsilon$.










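PROOF: For any positive integral values $m, k$ and for any $\rho > 0$, let $\Omega^{m,k,\rho}$ be the class of all elements $F$ of $\Omega$ for which

$$\operatorname{prob}\{x^1 \le k \text{ and } x^2 \le k \text{ and } \cdots\ x^m \le k\} \ge 1 - \rho.$$

A probability measure $\xi$ for which $\xi(\Omega^{m,k,\rho}) = 1$ will be denoted by $\xi^{m,k,\rho}$. To prove (2.14), we shall first prove the inequality

$$\left| \sup_{\xi^{m,k,\rho}} \inf_{\eta^{m,\epsilon}} r(\xi^{m,k,\rho}, \eta^{m,\epsilon}) - \inf_{\eta^{m,\epsilon}} \sup_{\xi^{m,k,\rho}} r(\xi^{m,k,\rho}, \eta^{m,\epsilon}) \right| \le \rho (W_0 + C_m) \tag{2.15}$$

where $C_m$ is an upper bound of $c(x^1, \dots, x^r, \mathfrak{D})$ for all $r \le m$, $x^1, \dots, x^r$ and $\mathfrak{D}$.

Since for any $d(x)$ in $Q^{m,\epsilon}$, $d(x)$ must be an element of $D_{x^1,\epsilon}$, and since $D_{x^1,\epsilon}$ is compact, it is sufficient to prove the validity of (2.14) in the case when $D_{x^1,\epsilon}$ is a finite set. Thus, we shall assume in the remainder of the proof that $D_{x^1,\epsilon}$ is finite.

Let $\delta$ be a given positive number and let $Q^{m,k,\epsilon}$ be a finite subset of $Q^{m,\epsilon}$ satisfying the following condition: for any element $\mathfrak{D} = d(x)$ in $Q^{m,\epsilon}$ there exists an element $\mathfrak{D}^* = d^*(x)$ in $Q^{m,k,\epsilon}$ such that

$$d^*(x) = d(x) \text{ and } |c(x, \mathfrak{D}^*) - c(x, \mathfrak{D})| \le \delta$$

for all $x$ for which $x^1 \le k, x^2 \le k, \dots$, and $x^m \le k$. Clearly, for any choice of $\delta$ there exists a finite subset $Q^{m,k,\epsilon}$ of $Q^{m,\epsilon}$ with the desired property. For any $\mathfrak{D}$ in $Q^{m,\epsilon}$, we can then find an element $\mathfrak{D}^*$ in $Q^{m,k,\epsilon}$ such that

$$r(F, \mathfrak{D}^*) \le r(F, \mathfrak{D}) + \rho (W_0 + C_m) + \delta$$

for all $F$ in $\Omega^{m,k,\rho}$. From this it follows that

$$\sup_{\xi^{m,k,\rho}} \inf_{\eta^{m,\epsilon}} r \le \sup_{\xi^{m,k,\rho}} \inf_{\eta^{m,k,\epsilon}} r \le \sup_{\xi^{m,k,\rho}} \inf_{\eta^{m,\epsilon}} r + \rho (W_0 + C_m) + \delta \tag{2.16}$$

$$\inf_{\eta^{m,\epsilon}} \sup_{\xi^{m,k,\rho}} r \le \inf_{\eta^{m,k,\epsilon}} \sup_{\xi^{m,k,\rho}} r \le \inf_{\eta^{m,\epsilon}} \sup_{\xi^{m,k,\rho}} r + \rho (W_0 + C_m) + \delta \tag{2.17}$$

where $\eta^{m,k,\epsilon}(Q^{m,k,\epsilon}) = 1$. Since $Q^{m,k,\epsilon}$ is finite, we have

$$\sup_{\xi^{m,k,\rho}} \inf_{\eta^{m,k,\epsilon}} r = \inf_{\eta^{m,k,\epsilon}} \sup_{\xi^{m,k,\rho}} r. \tag{2.18}$$

Inequality (2.15) follows from (2.16), (2.17) and (2.18) and the fact that $\delta$ can be chosen arbitrarily small.

Lemmas 1.1, 1.3 and the inequality (2.15) imply that Lemma 2.3 must hold if

$$\lim_{k=\infty} \inf_{\eta^{m,\epsilon}} \sup_{\xi^{m,k,\rho}} r = \inf_{\eta^{m,\epsilon}} \sup_{\xi} r \tag{2.19}$$

holds. Thus, the proof of Lemma 2.3 is completed if we can show the validity of (2.19).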
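Let $\{\eta_k^{m,\epsilon}\}$ $(k = 1, 2, \dots$, ad inf.) be a sequence of randomized decision functions such that

$$\lim_{k=\infty} \left[ \sup_{\xi^{m,k,\rho}} r(\xi^{m,k,\rho}, \eta_k^{m,\epsilon}) - \inf_{\eta^{m,\epsilon}} \sup_{\xi^{m,k,\rho}} r(\xi^{m,k,\rho}, \eta^{m,\epsilon}) \right] = 0. \tag{2.20}$$

Let $\zeta_k = \zeta(\eta_k^{m,\epsilon})$ (see definition in Section 2.3) and let $\zeta_k$ be given by the two sequences of functions $\{z_{r,k}(x^1, \dots, x^r)\}$ and $\{\delta_{x^1 \dots x^r, k}\}$ $(r = 1, 2, \dots, m)$. Since there are only countably many samples $(x^1, \dots, x^r)$ $(r \le m)$, there exists a subsequence $\{k'\}$ of the sequence $\{k\}$ such that

$$\lim_{k'=\infty} z_{r,k'}(x^1, \dots, x^r) = z_r(x^1, \dots, x^r) \tag{2.21}$$

and

$$\lim_{k'=\infty} \delta_{x^1 \dots x^r, k'} = \delta_{x^1 \dots x^r} \tag{2.22}$$

for all $r$ and all samples $(x^1, \dots, x^r)$. Let $\eta_0^{m,\epsilon}$ be a randomized decision function such that $\zeta(\eta_0^{m,\epsilon})$ is equal to the $\zeta$ defined by $\{z_r(x^1, \dots, x^r)\}$ and $\{\delta_{x^1 \dots x^r}\}$ $(r = 1, 2, \dots, m)$.

For any element $F$ of $\Omega$ and for any $\nu > 0$, there exists a finite subset $M_\nu$ of the $m$-dimensional sample space such that the probability (under $F$) that the sample $(x^1, \dots, x^m)$ will fall in $M_\nu$ is $\ge 1 - \nu$. From this and the continuity of the cost function (Condition 2.5) it follows that

$$\lim_{k'=\infty} r(F, \eta_{k'}^{m,\epsilon}) = r(F, \eta_0^{m,\epsilon}) \quad \text{for all } F. \tag{2.23}$$

Clearly,

$$\sup_{\xi^{m,k,\rho}} r(\xi^{m,k,\rho}, \eta) = \sup_{F^{m,k,\rho}} r(F^{m,k,\rho}, \eta) \tag{2.24}$$

where $F^{m,k,\rho}$ is an element of $\Omega^{m,k,\rho}$. Hence

$$\inf_{\eta^{m,\epsilon}} \sup_{\xi^{m,k,\rho}} r(\xi^{m,k,\rho}, \eta^{m,\epsilon}) = \inf_{\eta^{m,\epsilon}} \sup_{F^{m,k,\rho}} r(F^{m,k,\rho}, \eta^{m,\epsilon}). \tag{2.25}$$

Since any $F$ in $\Omega$ is contained in $\Omega^{m,k,\rho}$ for sufficiently large $k$, it follows from (2.20) and (2.25) that

$$\lim_{k=\infty} r(F, \eta_k^{m,\epsilon}) \le \lim_{k=\infty} \left\{ \inf_{\eta^{m,\epsilon}} \sup_{F^{m,k,\rho}} r(F^{m,k,\rho}, \eta^{m,\epsilon}) \right\}. \tag{2.26}$$

Hence, because of (2.23),

$$r(F, \eta_0^{m,\epsilon}) \le \lim_{k=\infty} \left\{ \inf_{\eta^{m,\epsilon}} \sup_{F^{m,k,\rho}} r(F^{m,k,\rho}, \eta^{m,\epsilon}) \right\}. \tag{2.27}$$

Thus,

$$\sup_{F} r(F, \eta_0^{m,\epsilon}) \le \lim_{k=\infty} \left\{ \inf_{\eta^{m,\epsilon}} \sup_{F^{m,k,\rho}} r(F^{m,k,\rho}, \eta^{m,\epsilon}) \right\}. \tag{2.28}$$

Since the left hand member of (2.28) cannot be smaller than the right hand member, the equality sign must hold. This concludes the proof of Lemma 2.3.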

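Theorem 2.1 can easily be proved with the help of Lemmas 2.1, 2.2 and 2.3. From Lemma 2.2 it follows that

$$\lim_{\epsilon=0} \sup_{\xi} \inf_{\eta^{m,\epsilon}} r = \sup_{\xi} \inf_{\eta^m} r. \tag{2.29}$$

From this and Lemma 2.3 we obtain

$$\lim_{\epsilon=0} \inf_{\eta^{m,\epsilon}} \sup_{\xi} r = \sup_{\xi} \inf_{\eta^m} r. \tag{2.30}$$

But

$$\lim_{\epsilon=0} \inf_{\eta^{m,\epsilon}} \sup_{\xi} r \ge \inf_{\eta^m} \sup_{\xi} r.$$

Hence

$$\inf_{\eta^m} \sup_{\xi} r \le \sup_{\xi} \inf_{\eta^m} r. \tag{2.31}$$

Hence, because of Lemma 1.3, we then must have

$$\sup_{\xi} \inf_{\eta^m} r = \inf_{\eta^m} \sup_{\xi} r. \tag{2.32}$$

It follows from Lemma 2.1 that

$$\lim_{m=\infty} \sup_{\xi} \inf_{\eta^m} r = \sup_{\xi} \inf_{\eta} r. \tag{2.33}$$

Hence, because of (2.32), we have

$$\lim_{m=\infty} \inf_{\eta^m} \sup_{\xi} r = \sup_{\xi} \inf_{\eta} r. \tag{2.34}$$

But

$$\lim_{m=\infty} \inf_{\eta^m} \sup_{\xi} r \ge \inf_{\eta} \sup_{\xi} r. \tag{2.35}$$

Hence

$$\inf_{\eta} \sup_{\xi} r \le \sup_{\xi} \inf_{\eta} r. \tag{2.36}$$

Theorem 2.1 is an immediate consequence of (2.36) and Lemma 1.3.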
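
What strict determinateness means can be seen on a toy problem with two distributions $F$ and two terminal decisions. The sketch below is illustrative only: the risk table is made up, the names `game_values` and `pure_values` are hypothetical, and the suprema and infima are approximated on a grid of mixing probabilities (exact arithmetic is used so that the comparison is not blurred by rounding).

```python
from fractions import Fraction

def game_values(risk, steps=100):
    """Grid versions of Sup_xi Inf_eta r and Inf_eta Sup_xi r for a 2 x 2 risk table.

    risk[f][d] is the risk when the true distribution is F_f and decision d is taken.
    """
    grid = [Fraction(i, steps) for i in range(steps + 1)]
    # For a fixed prior xi = (p, 1 - p) the infimum over eta is attained at a pure decision.
    lower = max(
        min(p * risk[0][d] + (1 - p) * risk[1][d] for d in range(2))
        for p in grid
    )
    # For a fixed rule eta = (q, 1 - q) the supremum over xi is attained at a single F.
    upper = min(
        max(q * risk[f][0] + (1 - q) * risk[f][1] for f in range(2))
        for q in grid
    )
    return lower, upper

def pure_values(risk):
    """The same two quantities when neither side is allowed to randomize."""
    lower = max(min(row) for row in risk)                        # max over F, min over d
    upper = min(max(row[d] for row in risk) for d in range(2))   # min over d, max over F
    return lower, upper

risk = [[0, 2], [3, 1]]          # made-up risk table, for illustration only
print(pure_values(risk))         # the two values differ without randomization
print(game_values(risk))         # the two values coincide once mixing is allowed
```

For this particular table the optimal mixing probabilities happen to lie on the grid, so the two grid values agree exactly; in general a grid only brackets the common value.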

2.5. Theorems on complete classes of decision functions and minimax solutions. For any positive $\epsilon$ we shall say that the randomized decision function $\eta_0$ is an $\epsilon$-Bayes solution relative to the a priori distribution $\xi$ if

$$r(\xi, \eta_0) \le \inf_{\eta} r(\xi, \eta) + \epsilon. \tag{2.37}$$

If $\eta_0$ satisfies (2.37) for $\epsilon = 0$, we shall say that $\eta_0$ is a Bayes solution relative to $\xi$.

A randomized decision rule $\eta_1$ is said to be uniformly better than $\eta_2$ if

$$r(F, \eta_1) \le r(F, \eta_2) \quad \text{for all } F \tag{2.38}$$

and if

$$r(F, \eta_1) < r(F, \eta_2) \quad \text{at least for one } F. \tag{2.39}$$

A class $C$ of randomized decision functions $\eta$ is said to be complete if for any $\eta$ not in $C$ we can find an element $\eta^*$ in $C$ such that $\eta^*$ is uniformly better than $\eta$.
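
These definitions are easy to check mechanically when there are only finitely many distributions $F$ and finitely many candidate rules. The following sketch makes exactly that simplifying assumption, which the text does not; the function names and the dictionary layout are hypothetical.

```python
def uniformly_better(r1, r2):
    """True if rule 1 is uniformly better than rule 2 in the sense of (2.38)-(2.39).

    r1, r2: dicts mapping each F (finitely many here, by assumption) to r(F, eta).
    """
    never_worse = all(r1[F] <= r2[F] for F in r2)
    somewhere_better = any(r1[F] < r2[F] for F in r2)
    return never_worse and somewhere_better

def is_eps_bayes(prior, risks, name, eps=0.0):
    """Check (2.37) for the rule `name` among the finitely many rules in `risks`.

    prior: dict F -> xi(F); risks: dict rule name -> dict F -> r(F, eta).
    The infimum in (2.37) is taken only over the listed rules (an assumption).
    """
    def bayes_risk(r):
        return sum(prior[F] * r[F] for F in prior)

    best = min(bayes_risk(r) for r in risks.values())
    return bayes_risk(risks[name]) <= best + eps
```

With `eps=0.0` the second function tests for a Bayes solution relative to `prior`, restricted to the listed rules.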







THEOREM 2.2. If Conditions 2.1-2.5 are fulfilled, then for any $\epsilon > 0$ the class $C_\epsilon$ of all $\epsilon$-Bayes solutions corresponding to all possible a priori distributions $\xi$ is a complete class.

PROOF: Let $\eta_0$ be a randomized decision function that is not an $\epsilon$-Bayes solution relative to any $\xi$. That is,

$$r(\xi, \eta_0) > \inf_{\eta} r(\xi, \eta) + \epsilon \quad \text{for all } \xi. \tag{2.40}$$

If $r(F, \eta_0) = \infty$ for all $F$, then there is evidently an element of $C_\epsilon$ that is uniformly better than $\eta_0$. Thus, we can restrict ourselves to the case where

$$r(F, \eta_0) < \infty \quad \text{at least for one } F. \tag{2.41}$$

Put

$$W^*(F, d) = W(F, d) - r(F, \eta_0) \tag{2.42}$$

and let $r^*(\xi, \eta)$ denote the risk when $W(F, d)$ is replaced by $W^*(F, d)$. Then

$$r^*(\xi, \eta) = r(\xi, \eta) - r(\xi, \eta_0). \tag{2.43}$$

Let $Q^m$ denote the class of all decision functions $d(x)$ for which $n(x) \le m$ identically in $x$. Furthermore, denote any $\eta$ for which $\eta(Q^m) = 1$ by $\eta^m$. We shall first prove the following relation:

$$\sup_{\xi} \inf_{\eta^m} r^*(\xi, \eta^m) = \inf_{\eta^m} \sup_{\xi} r^*(\xi, \eta^m) \tag{2.44}$$

for any positive integral value $m$. For any positive constant $c$, let $\Omega_c$ denote the class of all elements $F$ for which $r(F, \eta_0) \le c$.

Clearly, Conditions 2.1-2.5 remain valid if we replace $W(F, d)$ by $W^*(F, d)$ and $\Omega$ by $\Omega_c$, where $c$ is restricted to values for which $\Omega_c$ is not empty. Hence, Theorem 2.1 can be applied and we obtain

$$\sup_{\xi^c} \inf_{\eta^m} r^*(\xi^c, \eta^m) = \inf_{\eta^m} \sup_{\xi^c} r^*(\xi^c, \eta^m), \tag{2.45}$$

where $\xi^c$ denotes any $\xi$ for which $\xi(\Omega_c) = 1$. Let $h$ and $w$ be two positive values for which

$$\sup_{\xi^c} \inf_{\eta^m} r^*(\xi^c, \eta^m) \ge -h \quad \text{for all } c \tag{2.46}$$

and

$$r(F, \eta^m) \le w \quad \text{for all } F \text{ and all } \eta^m. \tag{2.47}$$

Clearly, such two constants $h$ and $w$ exist. From (2.46) and Lemma 1.3 we obtain

$$\inf_{\eta^m} \sup_{\xi} r^*(\xi, \eta^m) \ge -h. \tag{2.48}$$

Since

$$r^*(F, \eta^m) < -(h + \delta) \quad \text{for any } F \text{ not in } \Omega_{h+w+\delta} \ (\delta > 0), \tag{2.49}$$

it follows from (2.48) that

$$\inf_{\eta^m} \sup_{\xi^c} r^* = \inf_{\eta^m} \sup_{\xi} r^* \quad \text{for all } c > h + w. \tag{2.50}$$

From (2.45) and (2.50) we obtain

$$\sup_{\xi^c} \inf_{\eta^m} r^* = \inf_{\eta^m} \sup_{\xi} r^* \quad \text{for all } c > h + w. \tag{2.51}$$

Hence,

$$\sup_{\xi} \inf_{\eta^m} r^* \ge \inf_{\eta^m} \sup_{\xi} r^*. \tag{2.51a}$$

Because of Lemma 1.3, the equality sign must hold and (2.44) is proved.
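Since $\eta_0$ is not an element of $C_\epsilon$, we must have

$$\inf_{\eta} r(\xi, \eta) < r(\xi, \eta_0) - \epsilon. \tag{2.52}$$

From this it follows that

$$\inf_{\eta} r^*(\xi, \eta) \le -\epsilon. \tag{2.53}$$

Hence

$$\sup_{\xi} \inf_{\eta} r^*(\xi, \eta) \le -\epsilon. \tag{2.54}$$

It was shown in the proof of Lemma 2.1 that for any $\rho > 0$ there exists a positive integer $m_\rho$, depending only on $\rho$, such that

$$\inf_{\eta^{m_\rho}} r(\xi, \eta^{m_\rho}) \le \inf_{\eta} r(\xi, \eta) + \rho \quad \text{for all } \xi. \tag{2.55}$$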
From (2.44), (2.54) and (2.55) it follows that there exists a positive integer $m_0$, namely $m_0 = m_{\epsilon/2}$, such that

$$\inf_{\eta^m} \sup_{\xi} r^*(\xi, \eta^m) \le -\frac{\epsilon}{2} \quad \text{for any } m \ge m_0. \tag{2.56}$$
From (2.44) and (2.56) it follows that there exists an a priori distribution é; 
and an e-Bayes solution 77 relative to & such that 
(2.57) mF, a) s- 5 for all F. 


Hence, because of (2.43), 


(2.58) r(F, m) < r(F, ») — ; for all F. 


and Theorem 2.2 is proved. 
THEOREM 2.3. If $D$ is compact, and if Conditions 2.1, 2.2, 2.4, 2.5 are fulfilled, then there exists a minimax solution, i.e., a decision rule $\eta_0$ for which

$$\sup_{F} r(F, \eta_0) \le \sup_{F} r(F, \eta) \quad \text{for all } \eta. \tag{2.59}$$

To prove the above theorem, we shall first prove the following lemma.

LEMMA 2.4. If $D$ is compact and if Conditions 2.1, 2.2, 2.4, 2.5 are fulfilled, then for any sequence $\{\eta_i\}$ $(i = 1, 2, \dots$, ad inf.) of randomized decision functions for which $r(F, \eta_i)$ is a bounded function of $F$ and $i$, there exists a subsequence $\{\eta_{i_j}\}$ $(j = 1, 2, \dots$, ad inf.) and a randomized decision function $\eta_0$ such that

$$\liminf_{j=\infty} r(\xi, \eta_{i_j}) \ge r(\xi, \eta_0) \quad \text{for all } \xi. \tag{2.60}$$


PROOF: Let $\zeta_i = \zeta(\eta_i)$ (defined in Section 2.3) be given by $\{z_{r,i}(x^1, \dots, x^r)\}$ and $\{\delta_{x^1 x^2 \dots x^r, i}\}$ $(r = 1, 2, \dots$, ad inf.). Thus, $z_{r,i}(x^1, \dots, x^r)$ is the conditional probability that we shall take an observation on $X^{r+1}$ using the rule $\eta_i$ and knowing that the first $r$ observations are given by $x^1, \dots, x^r$ and that experimentation was not terminated for $(x^1, \dots, x^k)$ $(k < r)$. As stated in Section 2.3, for any $r, x^1, \dots, x^r$ the symbol $\delta_{x^1 \dots x^r, i}$ denotes the conditional probability distribution of the selected $d$ when $\eta_i$ is used and it is known that the first $r$ observations are equal to $x^1, \dots, x^r$ and that $n(x) = r$. Since there are only countably many finite samples $(x^1, \dots, x^r)$, it is possible to find a subsequence $\{i_j\}$ of $\{i\}$ such that $\lim_{j=\infty} z_{r,i_j}(x^1, \dots, x^r)$ and $\lim_{j=\infty} \delta_{x^1 \dots x^r, i_j}$ exist.$^5$ Put

$$\lim_{j=\infty} z_{r,i_j}(x^1, \dots, x^r) = z_{r,0}(x^1, \dots, x^r) \tag{2.61}$$

and

$$\lim_{j=\infty} \delta_{x^1 \dots x^r, i_j} = \delta_{x^1 \dots x^r, 0}. \tag{2.62}$$

As shown in Section 2.3, there exists a randomized decision function $\eta_0$ such that $\zeta_0 = \zeta(\eta_0)$ is given by $\{z_{r,0}(x^1, \dots, x^r)\}$ and $\{\delta_{x^1 \dots x^r, 0}\}$. Let $q_{r,i}(x^1, \dots, x^r \mid \xi)$ denote the probability that the sample $(x^1, \dots, x^r)$ will be obtained and that experimentation will be stopped at the $r$-th observation when $\xi$ is the a priori distribution and $\eta_i$ is the decision rule used by the statistician. For any sample $(x^1, \dots, x^r)$ let $R_i(x^1, \dots, x^r)$ denote the expected value of $W(F, d)$ when the distribution of $F$ is equal to the a posteriori distribution of $F$ as implied by $\xi$ and $(x^1, \dots, x^r)$ and where $d$ is a chance variable independent of $F$ with the probability distribution $\delta_{x^1 \dots x^r, i}$. Since $r(\xi, \eta_i)$ is bounded by assumption, the probability that experimentation will go on indefinitely is equal to zero. From this it follows that

$$\sum_{r, x^1, \dots, x^r} q_{r,i}(x^1, \dots, x^r \mid \xi) = 1 \quad \text{for all } \xi. \tag{2.63}$$


$^5$ The existence of $\lim_{j=\infty} \delta_{x^1 \dots x^r, i_j}$ follows from the compactness of $D$ (see Theorem 3.6 in [3]).










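Then $r(\xi, \eta_i)$ is given by

$$r(\xi, \eta_i) = \sum_{r, x^1, \dots, x^r} q_{r,i}(x^1, \dots, x^r \mid \xi) \left[ R_i(x^1, \dots, x^r) + \frac{\int_{Q_{x^1 \dots x^r}} c(x^1, \dots, x^r, \mathfrak{D}) \, d\eta_i}{\int_{Q_{x^1 \dots x^r}} d\eta_i} \right] \tag{2.64}$$

where $Q_{x^1 \dots x^r}$ is the totality of all decision functions $d(x)$ for which $n(y) = r$ whenever $y^1 = x^1, \dots, y^r = x^r$. Clearly,

$$\lim_{j=\infty} q_{r,i_j}(x^1, \dots, x^r \mid \xi) = q_{r,0}(x^1, \dots, x^r \mid \xi). \tag{2.65}$$

Since $D$ is compact and since $W(F, d)$ is a continuous function of $d$ uniformly in $F$ (in the sense of the metric defined in $D$), we have

$$\lim_{j=\infty} R_{i_j}(x^1, \dots, x^r) = R_0(x^1, \dots, x^r). \tag{2.66}$$

From Condition 2.5 it follows that

$$\lim_{j=\infty} \frac{\int_{Q_{x^1 \dots x^r}} c(x^1, \dots, x^r, \mathfrak{D}) \, d\eta_{i_j}}{\int_{Q_{x^1 \dots x^r}} d\eta_{i_j}} = \frac{\int_{Q_{x^1 \dots x^r}} c(x^1, \dots, x^r, \mathfrak{D}) \, d\eta_0}{\int_{Q_{x^1 \dots x^r}} d\eta_0}. \tag{2.67}$$

Lemma 2.4 is an immediate consequence of the equations (2.64)-(2.67).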


We are now in a position to prove Theorem 2.3. Because of Theorem 2.1 there exists a sequence $\{\eta_i\}$ such that

$$\lim_{i=\infty} \sup_{F} r(F, \eta_i) = \inf_{\eta} \sup_{F} r(F, \eta). \tag{2.68}$$

According to Lemma 2.4 there exists a subsequence $\{\eta_{i_j}\}$ $(j = 1, 2, \dots$, ad inf.) and a randomized decision function $\eta_0$ such that

$$\liminf_{j=\infty} r(F, \eta_{i_j}) \ge r(F, \eta_0) \quad \text{for all } F. \tag{2.69}$$

It follows from (2.68) and (2.69) that $\eta_0$ is a minimax solution and Theorem 2.3 is proved.

THEOREM 2.4. If $D$ is compact and if Conditions 2.1, 2.2, 2.4, 2.5 are fulfilled, then for any $\xi$ there exists a Bayes solution relative to $\xi$.

This theorem is an immediate consequence of Lemma 2.4.

We shall say that $\eta_0$ is a Bayes solution in the wide sense, if there exists a sequence $\{\xi_i\}$ $(i = 1, 2, \dots$, ad inf.) such that

$$\lim_{i=\infty} \left[ r(\xi_i, \eta_0) - \inf_{\eta} r(\xi_i, \eta) \right] = 0. \tag{2.70}$$

We shall say that $\eta_0$ is a Bayes solution in the strict sense, if there exists a $\xi$ such that $\eta_0$ is a Bayes solution relative to $\xi$.




THEOREM 2.5. If $D$ is compact and Conditions 2.1-2.5 hold, then the class of all Bayes solutions in the wide sense is a complete class.

PROOF: Let $\eta_0$ be a decision rule that is not a Bayes solution in the wide sense. Consider the weight function $W^*(F, d) = W(F, d) - r(F, \eta_0)$. We may assume that $r(F, \eta_0) < \infty$ for at least some $F$, since otherwise there obviously exists a Bayes solution in the wide sense that is uniformly better than $\eta_0$. Then it follows easily from (2.44) and Lemmas 2.1 and 1.3 that

$$\sup_{\xi} \inf_{\eta} r^*(\xi, \eta) = \inf_{\eta} \sup_{\xi} r^*(\xi, \eta) = v^* \ \text{(say)}, \tag{2.71}$$

where $r^*(\xi, \eta)$ is the risk corresponding to $W^*(F, d)$, i.e.,

$$r^*(\xi, \eta) = r(\xi, \eta) - r(\xi, \eta_0). \tag{2.72}$$

Theorem 2.3 is clearly applicable to the risk function $r^*(\xi, \eta)$. Then, there exists a minimax solution $\eta_1$ for the problem corresponding to the new weight function $W^*(F, d)$. Since, because of (2.72), $v^* \le 0$, we have

$$r^*(\xi, \eta_1) = r(\xi, \eta_1) - r(\xi, \eta_0) \le 0 \quad \text{for all } \xi. \tag{2.73}$$

Our theorem is proved, if we can show that $\eta_1$ is a Bayes solution in the wide sense. Let $\{\xi_i\}$ $(i = 1, 2, \dots$, ad inf.) be a sequence of a priori distributions such that

$$\lim_{i=\infty} \inf_{\eta} r^*(\xi_i, \eta) = v^*. \tag{2.74}$$

Since $\eta_1$ is a minimax solution, we must have

$$r^*(\xi_i, \eta_1) \le v^*. \tag{2.75}$$

It follows from (2.74) and (2.75) that $\eta_1$ is a Bayes solution in the wide sense and our theorem is proved.

We shall now formulate an additional condition which will permit the derivation of some stronger theorems. First, we shall give a convergence definition in the space $\Omega$. We shall say that $F_i$ converges to $F$ in the ordinary sense if

$$\lim_{i=\infty} p_r(x^1, \dots, x^r \mid F_i) = p_r(x^1, \dots, x^r \mid F) \quad (r = 1, 2, \dots, \text{ad inf.}). \tag{2.76}$$

Here $p_r(x^1, \dots, x^r \mid F)$ denotes the probability, under $F$, that the first $r$ observations will be equal to $x^1, \dots, x^r$, respectively. We shall say that a subset $\omega$ of $\Omega$ is compact in the ordinary sense, if $\omega$ is compact in the sense of the convergence definition (2.76).

CONDITION 2.6. The space $\Omega$ is compact in the ordinary sense. If $F_i$ converges to $F$, as $i \to \infty$, in the ordinary sense, then

$$\lim_{i=\infty} W(F_i, d) = W(F, d)$$

uniformly in $d$.

THEOREM 2.6. If $D$ is compact and if Conditions 2.1, 2.2, 2.4, 2.5, 2.6 hold, then:

(i) there exists a least favorable a priori distribution, i.e., an a priori distribution $\xi_0$ for which

$$\inf_{\eta} r(\xi_0, \eta) = \sup_{\xi} \inf_{\eta} r(\xi, \eta).$$

(ii) A minimax solution exists and any minimax solution is a Bayes solution in the strict sense.

(iii) If $\eta_0$ is a decision rule which is not a Bayes solution in the strict sense and for which $r(F, \eta_0)$ is a bounded function of $F$, then there exists a decision rule $\eta_1$ which is a Bayes solution in the strict sense and is uniformly better than $\eta_0$.

PROOF: Let $\{\xi_i\}$ $(i = 1, 2, \dots$, ad inf.) be a sequence of a priori distributions such that

$$\lim_{i=\infty} \inf_{\eta} r(\xi_i, \eta) = \sup_{\xi} \inf_{\eta} r(\xi, \eta). \tag{2.77}$$

Since $\Omega$ is compact in the ordinary sense, there exists an a priori distribution $\xi_0$ and a subsequence $\{\xi_{i_j}\}$ of $\{\xi_i\}$ such that

$$\lim_{j=\infty} \xi_{i_j}(\omega) = \xi_0(\omega) \tag{2.78}$$

for any subset $\omega$ of $\Omega$ which is open (in the sense of the ordinary convergence definition in $\Omega$) and for which $\xi_0(\omega^*) = 0$, where $\omega^*$ denotes the set of all boundary points of $\omega$. We shall show that $\xi_0$ is a least favorable distribution. Assume that it is not. Then there exists a decision function $\mathfrak{D}_0 = d_0(x)$ such that

$$r(\xi_0, \mathfrak{D}_0) \le v - \delta, \tag{2.79}$$

where $\delta > 0$ and $v$ denotes the common value of $\sup_{\xi} \inf_{\eta} r$ and $\inf_{\eta} \sup_{\xi} r$. It was shown in the proof of Lemma 2.1 that (2.79) implies the existence of a decision function $\mathfrak{D}_1 = d_1(x)$ and that of a positive integer $m$ such that

$$n(x; \mathfrak{D}_1) \le m \quad \text{for all } x \tag{2.80}$$

and

$$r(\xi_0, \mathfrak{D}_1) \le v - \frac{\delta}{2}. \tag{2.81}$$

Since $c(x^1, \dots, x^r, \mathfrak{D}_1)$ and $W(F, d)$ are uniformly bounded and $W(F, d)$ is continuous in $F$ uniformly in $d$, we have

$$\lim_{i=\infty} r(F_i, \mathfrak{D}_1) = r(F, \mathfrak{D}_1) \tag{2.82}$$

for any sequence $\{F_i\}$ for which $F_i \to F$ in the ordinary sense. From (2.78), (2.82) and the compactness of $\Omega$ (in the ordinary sense) it follows that

$$\lim_{j=\infty} r(\xi_{i_j}, \mathfrak{D}_1) = r(\xi_0, \mathfrak{D}_1) \le v - \frac{\delta}{2}. \tag{2.83}$$

But this is in contradiction to (2.77) and, therefore, $\xi_0$ must be a least favorable distribution. Hence, statement (i) of our theorem is proved.

Statement (ii) is an immediate consequence of Theorems 2.1, 2.3 and statement (i) of Theorem 2.6.

To prove (iii), replace the weight function $W(F, d)$ by $W^*(F, d) = W(F, d) - r(F, \eta_0)$ where $\eta_0$ satisfies the conditions imposed on it in (iii).

We shall show that (i) remains valid also when $W(F, d)$ is replaced by $W^*(F, d)$. This is not clear, since $W^*(F, d)$ may not be continuous in $F$. First we shall prove that

$$\liminf_{i=\infty} r(\xi_i, \eta_0) \ge r(\xi_0, \eta_0) \tag{2.84}$$

for any sequence $\{\xi_i\}$ for which $\xi_i \to \xi_0$ in the ordinary sense, i.e., for which

$$\lim_{i=\infty} \xi_i(\omega) = \xi_0(\omega) \tag{2.85}$$

for any open subset $\omega$ (open in the sense of ordinary convergence defined in $\Omega$) whose boundary has probability measure zero according to $\xi_0$. For any sample $x^1, \dots, x^r$ let $q_{r,i}(x^1, \dots, x^r)$ denote the probability that the first $r$ observations will be equal to $x^1, \dots, x^r$, respectively, when $\xi_i$ is the a priori distribution. Clearly,

$$q_{r,i}(x^1, \dots, x^r) = \int_{\Omega} p_r(x^1, \dots, x^r \mid F) \, d\xi_i. \tag{2.86}$$

Since $p_r(x^1, \dots, x^r \mid F)$ is a continuous function of $F$, we have

$$\lim_{i=\infty} q_{r,i}(x^1, \dots, x^r) = q_{r,0}(x^1, \dots, x^r). \tag{2.87}$$

The function $r(\xi, \eta_0)$ can be split into two parts, i.e., $r(\xi, \eta_0) = r_1(\xi, \eta_0) + r_2(\xi, \eta_0)$ where $r_1$ is the expected value of the loss $W(F, d)$ and $r_2$ is the expected cost of experimentation. Since $W(F, d)$ is a bounded function of $F$ and $d$, and since $W(F, d)$ is continuous in $F$ uniformly in $d$, we have

$$\lim_{i=\infty} r_1(\xi_i, \eta_0) = r_1(\xi_0, \eta_0) \tag{2.88}$$

for any sequence $\{\xi_i\}$ which satisfies (2.85). To prove (2.84), we merely have to show that

$$\liminf_{i=\infty} r_2(\xi_i, \eta_0) \ge r_2(\xi_0, \eta_0). \tag{2.89}$$

But

$$r_2(\xi_i, \eta_0) = \sum_{r, x^1, \dots, x^r} q_{r,i}(x^1, \dots, x^r) \int_{Q_{x^1 \dots x^r}} c(x^1, \dots, x^r, \mathfrak{D}) \, d\eta_0 \tag{2.90}$$

where $Q_{x^1 \dots x^r}$ is the totality of all decision functions $d(x)$ with the property that $n(y) = r$ for any $y$ whose first $r$ coordinates are equal to $x^1, \dots, x^r$, respectively. Inequality (2.89) is an immediate consequence of (2.87) and (2.90). Hence, (2.84) is proved.










Let $r^*(\xi, \eta)$ be the risk function when $W(F, d)$ is replaced by $W^*(F, d)$, i.e., $r^*(\xi, \eta) = r(\xi, \eta) - r(\xi, \eta_0)$. Let, furthermore, $\{\xi_i^*\}$ be a sequence of a priori distributions such that

$$\lim_{i=\infty} \inf_{\eta} r^*(\xi_i^*, \eta) = \sup_{\xi} \inf_{\eta} r^*(\xi, \eta). \tag{2.91}$$

There exists a subsequence $\{\xi_{i_j}^*\}$ of the sequence $\{\xi_i^*\}$ such that $\xi_{i_j}^*$ converges (in the ordinary sense) to a limit distribution $\xi_0^*$ as $j \to \infty$. We shall show that $\xi_0^*$ is a least favorable distribution. For suppose that $\xi_0^*$ is not a least favorable distribution. Then there exists a decision function $\mathfrak{D}_0^* = d_0^*(x)$ such that

$$r^*(\xi_0^*, \mathfrak{D}_0^*) \le v^* - \delta \tag{2.92}$$

where $\delta > 0$ and $v^* = \sup_{\xi} \inf_{\eta} r^* = \inf_{\eta} \sup_{\xi} r^*$. But then there exists a decision function $\mathfrak{D}_1^* = d_1^*(x)$ and a positive integer $m$ such that

$$n(x; \mathfrak{D}_1^*) \le m \quad \text{for all } x \tag{2.93}$$

and

$$r^*(\xi_0^*, \mathfrak{D}_1^*) \le v^* - \frac{\delta}{2}. \tag{2.94}$$

Since $r^*(\xi, \mathfrak{D}_1^*) = r(\xi, \mathfrak{D}_1^*) - r(\xi, \eta_0)$, and since

$$\lim_{j=\infty} r(\xi_{i_j}^*, \mathfrak{D}_1^*) = r(\xi_0^*, \mathfrak{D}_1^*),$$

it follows from (2.84) and (2.94) that

$$\limsup_{j=\infty} r^*(\xi_{i_j}^*, \mathfrak{D}_1^*) \le v^* - \frac{\delta}{2} \tag{2.95}$$

which is in contradiction to (2.91). Hence, the validity of (i) is proved also when $W(F, d)$ is replaced by $W^*(F, d)$. Clearly, also (ii) remains valid when $W(F, d)$ is replaced by $W^*(F, d)$.

Let $\eta_1$ be a minimax solution relative to the problem corresponding to $W^*(F, d)$. Then because of (ii), $\eta_1$ is a Bayes solution in the strict sense. Since $\eta_0$ is not a Bayes solution in the strict sense, $\eta_1 \ne \eta_0$ and $v^* < 0$. Hence $\eta_1$ is uniformly better than $\eta_0$. This completes the proof of Theorem 2.6.

We shall now replace Condition 2.6 by the following weaker one.

CONDITION 2.6*. There exists a sequence $\{\Omega_i\}$ $(i = 1, 2, \dots$, ad inf.) of subsets of $\Omega$ such that Condition 2.6 is fulfilled when $\Omega$ is replaced by $\Omega_i$, $\Omega_{i+1} \supset \Omega_i$ and $\lim_{i=\infty} \Omega_i = \Omega$.

We shall say that $\eta_i$ converges weakly to $\eta$ as $i \to \infty$, if $\lim_{i=\infty} \zeta(\eta_i) = \zeta(\eta)$. We shall also say that $\eta$ is a weak limit of $\eta_i$. This limit definition seems to be natural, since $r(\xi, \eta_1) = r(\xi, \eta_2)$ if $\zeta(\eta_1) = \zeta(\eta_2)$. We shall now prove the following theorem:




THEOREM 2.7. If $D$ is compact and if Conditions 2.1, 2.2, 2.4, 2.5 and 2.6* are fulfilled, then:

(i) A minimax solution exists that is a weak limit of a sequence of Bayes solutions in the strict sense.

(ii) Let $\eta_0$ be a decision rule for which $r(F, \eta_0)$ is a bounded function of $F$. Then there exists a decision rule $\eta_1$ that is a weak limit of a sequence of Bayes solutions in the strict sense and such that $r(F, \eta_1) \le r(F, \eta_0)$ for all $F$ in $\Omega$.

PROOF: According to Theorem 2.6, there exists a decision rule $\eta_i$ that is a Bayes solution in the strict sense and a minimax solution if $\Omega$ is replaced by $\Omega_i$. There exists a subsequence $\{\eta_{i_j}\}$ $(j = 1, 2, \dots$, ad inf.) of the sequence $\{\eta_i\}$ such that $\{\eta_{i_j}\}$ admits a weak limit. Let $\eta_0$ be a weak limit of $\{\eta_{i_j}\}$. Then, as shown in the proof of Lemma 2.4, equation (2.60) holds and $\eta_0$ is a minimax solution relative to the original space $\Omega$. Thus, statement (i) is proved.

To prove (ii), replace $W(F, d)$ by $W^*(F, d) = W(F, d) - r(F, \eta_0)$. According to Theorem 2.6 there exists a decision rule $\eta_{1i}$ such that $\eta_{1i}$ is a minimax solution and a Bayes solution in the strict sense when $\Omega$ is replaced by $\Omega_i$ and $W(F, d)$ by $W^*(F, d)$. Clearly, $\eta_{1i}$ remains a Bayes solution in the strict sense also relative to $\Omega$ and $W(F, d)$. Since $\eta_{1i}$ is a minimax solution relative to $\Omega_i$ and $W^*(F, d)$, we have

$$r(F, \eta_{1i}) \le r(F, \eta_0) \quad \text{for all } F \text{ in } \Omega_i. \tag{2.96}$$

Let $\{\eta_{1i_j}\}$ be a subsequence of the sequence $\{\eta_{1i}\}$ such that $\{\eta_{1i_j}\}$ admits a weak limit $\eta_1$. Then, (2.60) holds for $\{\eta_{1i_j}\}$ and $\eta_1$, and

$$r(F, \eta_1) \le r(F, \eta_0) \quad \text{for all } F \text{ in } \Omega. \tag{2.97}$$

Since $\eta_1$ is a weak limit of strict Bayes solutions, statement (ii) is proved.


3. Statistical decision functions: the case of continuous chance variables. 

3.1. Introductory remarks. In this section we shall be concerned with the case where the probability distribution $F$ of $X$ is absolutely continuous, i.e., for any element $F$ of $\Omega$ and for any positive integer $r$ there exists a joint density function $p_r(x^1, \dots, x^r \mid F)$ of the first $r$ chance variables $X^1, \dots, X^r$.

The continuous case can immediately be reduced to the discrete case discussed in section 2 if the observations are not given exactly but only up to a finite number of decimal places. More precisely, we mean this: For each $i$, let the real axis $R$ be subdivided into a denumerable number of disjoint sets $R_{i1}, R_{i2}, \ldots$, ad inf. Suppose that the observed value $x^i$ of $X^i$ is not given exactly; it is merely known which element of the sequence $\{R_{ij}\}$ ($j = 1, 2, \ldots$, ad inf.) contains $x^i$. This is the situation, for example, if the value of $x^i$ is given merely up to a finite number, say $r$, of decimal places ($r$ fixed, independent of $i$). This case can be reduced to the previously discussed discrete case, since we can regard the sets $R_{ij}$ as our points, i.e., we can replace the chance variable $X^i$ by $Y^i$, where $Y^i$ can take only the values $R_{i1}, R_{i2}, \ldots$, ad inf. ($Y^i$ takes the value $R_{ij}$ if $X^i$ falls in $R_{ij}$). If $W(F_1, d) = W(F_2, d)$ whenever the distribution of $Y$ under $F_1$ is identical with that under $F_2$, only the chance variables $Y^1, Y^2, \ldots$, etc. play a role in the decision problem and we have the discrete case. If the latter condition on the weight function is not fulfilled, i.e., if there exists a pair $(F_1, F_2)$ such that $W(F_1, d) \ne W(F_2, d)$ for some $d$ and the distribution of $Y$ is the same under $F_1$ as under $F_2$, we can still reduce the problem to the discrete case, if in the discrete case we permit the weight $W$ to depend also on a third extraneous variable $G$, i.e., if we put $W = W(F, G, d)$, where $G$ is a variable about whose value the sample does not give any information. The results obtained in the discrete case can easily be generalized to include the situation where $W = W(F, G, d)$.
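The reduction described above can be illustrated by a minimal sketch. It assumes, purely for illustration, that the cells are the half-open intervals of length $10^{-r}$ obtained when an observation is recorded to $r$ decimal places; the function names `cell_index` and `discretize` are hypothetical and do not appear in the paper.

```python
import math

def cell_index(x, decimals):
    # Index j of the half-open cell [j / 10**decimals, (j + 1) / 10**decimals)
    # that contains x. Values lying exactly on a cell boundary are subject to
    # floating-point rounding, so this is only a sketch.
    return math.floor(x * 10 ** decimals)

def discretize(sample, decimals):
    # Replace each observation x^i by the label of the cell containing it,
    # i.e. pass from the chance variable X^i to the discrete variable Y^i.
    return [cell_index(x, decimals) for x in sample]

sample = [3.14159, 2.71828, -0.123]
labels = discretize(sample, 2)
print(labels)
```

Two observations that fall in the same cell receive the same label, so any decision rule based on the labels alone is a rule for the discrete problem.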

In practical applications the observed value $x^i$ of $X^i$ will usually be given only up to a certain number of decimal places and, thus, the problem can be reduced to the discrete case. Nevertheless, it seems desirable from the theoretical point of view to develop the theory of the continuous case, assuming that the observed value $x^i$ of $X^i$ is given precisely.

In section 2.3 an alternative definition of a randomized decision rule was given in terms of two sequences of functions $\{z_r(x^1, \ldots, x^r)\}$ and $\{\delta_{x^1 \ldots x^r}\}$ ($r = 1, 2, \ldots$, ad inf.). We used the symbol $\zeta$ to denote a randomized decision rule given by two such sequences. It was shown in the discrete case that the use of a randomized decision function $\eta$ generates a certain $\zeta = \zeta(\eta)$, and that for any given $\zeta$ there exists an $\eta$ such that $\zeta = \zeta(\eta)$. Furthermore, because of Condition 2.5, in the discrete case we had $r(F, \eta_1) = r(F, \eta_2)$ if $\zeta(\eta_1) = \zeta(\eta_2)$. It would be possible to develop a similar theory as to the relation between $\zeta$ and $\eta$ also in the continuous case. However, a somewhat different procedure will be followed for the sake of simplicity. Instead of the decision functions $d(x)$, we shall regard the $\zeta$'s as the pure strategies of the statistician, i.e., we replace the space $\mathfrak{D}$ of all decision functions $d(x)$ by the space $Z$ of all randomized decision rules $\zeta$. It will then be necessary to consider probability measures $\eta$ defined over an additive class of subsets of $Z$. It will be sufficient, as will be seen later, to consider only discrete probability measures $\eta$. A probability measure $\eta$ is said to be discrete if it assigns the probability 1 to some denumerable subset of $Z$. Any discrete $\eta$ will clearly generate a certain $\zeta = \zeta(\eta)$. In the next section we shall formulate some conditions which will imply that $r(F, \eta_1) = r(F, \eta_2)$ if $\zeta(\eta_1) = \zeta(\eta_2)$. Thus, it will be possible to restrict ourselves to consideration of pure strategies $\zeta$, which will cause considerable simplifications.

The definitions of various notions given in the discrete case, such as minimax solution, Bayes solution, a priori distribution $\xi$ in $\Omega$, least favorable a priori distribution, complete class of decision functions, etc. can immediately be extended to the continuous case and will, therefore, not be restated here.

3.2. Conditions on $\Omega$, $D$, $W(F, d)$ and the cost function. In this section we shall formulate conditions similar to those given in the discrete case.

Condition 3.1. Each element $F$ of $\Omega$ is absolutely continuous.

Condition 3.2. $W(F, d)$ is a bounded function of $F$ and $d$.

Condition 3.3. The space $D$ is compact in the sense of its intrinsic metric $\delta(d_1, d_2; \Omega)$ (see equation 2.2).




This condition is somewhat stronger than the corresponding Condition 2.3. 
While it may be possible to weaken this condition, it would make the proofs of 
certain theorems considerably more involved. 

Condition 3.4. The cost of experimentation $c(x^1, \ldots, x^m)$ does not depend on $\zeta$. It is non-negative and $\lim_{m \to \infty} c(x^1, \ldots, x^m) = \infty$ uniformly in $x^1, \ldots, x^m$. For each positive integral value $m$, $c(x^1, \ldots, x^m)$ is a bounded function of $x^1, \ldots, x^m$.

This condition is stronger than Conditions 2.4 and 2.5 postulated in the discrete case. The reason for formulating a stronger condition here is that we wish the relation $r(F, \eta_1) = r(F, \eta_2)$ to be fulfilled whenever $\zeta(\eta_1) = \zeta(\eta_2)$, which will make it possible for us to eliminate the consideration of $\eta$'s altogether. Since the $\zeta$'s are regarded here as the pure strategies of the statistician, it is not clear what kind of dependence of the cost on $\zeta$ would be consistent with the requirement that $r(F, \eta_1) = r(F, \eta_2)$ whenever $\zeta(\eta_1) = \zeta(\eta_2)$.

We shall say that $F_i \to F$ in the ordinary sense, if for any positive integral value $m$

$\lim_{i \to \infty} \int_{S_m} p_m(x^1, \ldots, x^m \mid F_i)\, dx^1 \cdots dx^m = \int_{S_m} p_m(x^1, \ldots, x^m \mid F)\, dx^1 \cdots dx^m$

uniformly in $S_m$, where $S_m$ is a subset of the $m$-dimensional sample space.

Condition 3.5. The space $\Omega$ is separable in the sense of the above convergence definition.$^6$

No such condition was formulated in the discrete case for the simple reason that in the discrete case $\Omega$ is always separable in the sense of the convergence definition given in (2.76).

3.3. Some lemmas. We shall first give a convergence definition in the space $Z$ of all $\zeta$'s which is somewhat different from the one given in the discrete case. Let $h_r(x^1, x^2, \ldots, x^r, D^*)$ denote the probability that experimentation will be terminated with the $r$th observation and that the final decision $d$ selected will be an element of $D^*$, knowing that the first $r$ observations are equal to $x^1, \ldots, x^r$, respectively. That is,

(3.1)   $h_r(x^1, \ldots, x^r, D^*) = z_1(x^1)\, z_2(x^1, x^2) \cdots z_{r-1}(x^1, \ldots, x^{r-1})\, \bigl(1 - z_r(x^1, \ldots, x^r)\bigr)\, \delta_{x^1 \ldots x^r}(D^*)$.
Clearly, the functions $h_r(x^1, \ldots, x^r, D^*)$ are non-negative and satisfy the following conditions:

(3.2)   $\sum_{r=1}^{\infty} h_r(x^1, \ldots, x^r, D^*) \le 1$ for any $D^*$ and for any sample $(x^1, x^2, \ldots)$.

(3.3)   $\sum_{j=1}^{\infty} h_r(x^1, \ldots, x^r, D_j^*) = h_r(x^1, \ldots, x^r, D^*)$ if $\sum_{j=1}^{\infty} D_j^* = D^*$ and $D_1^*, D_2^*, \ldots$, etc. are disjoint.


$^6$ For a definition of a separable space, see F. Hausdorff, Mengenlehre (3rd edition), p. 125.










One can easily verify that for any sequence of non-negative functions $\{h_r(x^1, \ldots, x^r, D^*)\}$ ($r = 1, 2, \ldots$) satisfying (3.2) and (3.3) there exists exactly one sequence $\{z_r(x^1, \ldots, x^r)\}$ and one sequence $\{\delta_{x^1 \ldots x^r}(D^*)\}$ such that (3.1) is fulfilled. Thus, a randomized decision rule $\zeta$ can be given by a sequence $\{h_r(x^1, \ldots, x^r, D^*)\}$ satisfying (3.2) and (3.3). The functions $z_r(x^1, \ldots, x^r)$ and $\delta_{x^1 \ldots x^r}$ need be defined only for samples $x^1, \ldots, x^r$ for which $z_i(x^1, \ldots, x^i) > 0$ for $i = 1, \ldots, r - 1$. The above-mentioned uniqueness of $z_r(x^1, \ldots, x^r)$ and $\delta_{x^1 \ldots x^r}$ was meant to hold if the definition of these functions is restricted to such samples $x^1, \ldots, x^r$.
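As a small numerical illustration of (3.1) and (3.2), the following sketch computes $h_1, \ldots, h_R$ along one fixed sample. It assumes, as the form of (3.1) suggests, that $z_r$ is the probability of continuing after the $r$th observation; the function name `stopping_probabilities` and the input lists are hypothetical.

```python
def stopping_probabilities(z, delta):
    # z[r-1]     : value of z_r on the observed prefix x^1, ..., x^r
    #              (read here as the probability of taking another observation)
    # delta[r-1] : value of delta_{x^1...x^r}(D*) on the same prefix
    # Returns [h_1, ..., h_R] as in (3.1).
    h = []
    reach = 1.0  # z_1 * ... * z_{r-1}: probability of getting past stage r-1
    for z_r, d_r in zip(z, delta):
        h.append(reach * (1.0 - z_r) * d_r)
        reach *= z_r
    return h

# A rule that continues with probability 1/2 twice and then must stop.
h_all = stopping_probabilities([0.5, 0.5, 0.0], [1.0, 1.0, 1.0])
h_half = stopping_probabilities([0.5, 0.5, 0.0], [0.5, 0.5, 0.5])
print(h_all, sum(h_all))
print(h_half, sum(h_half))
```

With $D^* = D$ (every $\delta$ equal to 1) the terms sum to the probability of stopping within $R$ observations, and for a smaller $D^*$ the sum can only decrease, in line with (3.2).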


For any bounded subset $S_r$ of the $r$-dimensional sample space, let

(3.4)   $H_r(S_r, D^*) = \int_{S_r} h_r(x^1, \ldots, x^r, D^*)\, dx^1 \cdots dx^r$.

Let $\{\zeta_i\}$ ($i = 0, 1, 2, \ldots$, ad inf.) be a sequence of decision rules, and $H_{r,i}(S_r, D^*)$ be the function $H_r(S_r, D^*)$ corresponding to $\zeta_i$. We shall say that

(3.5)   $\lim_{i \to \infty} \zeta_i = \zeta_0$

if

(3.6)   $\lim_{i \to \infty} H_{r,i}(S_r, D^*) = H_{r,0}(S_r, D^*)$

for any $r$, any bounded set $S_r$ and for any $D^*$ that is an element of a sequence $\{D^*_{k_1 \ldots k_l}\}$ ($k_j = 1, \ldots, r_j$; $j = 1, \ldots, l$; $l = 1, 2, \ldots$, ad inf.) of subsets of $D$ satisfying the following conditions:

(3.7)   $\sum_{k_1=1}^{r_1} D^*_{k_1} = D$;   $\sum_{k_{l+1}=1}^{r_{l+1}} D^*_{k_1 \ldots k_l k_{l+1}} = D^*_{k_1 \ldots k_l}$,

(3.8)   for fixed $l$, the sets $D^*_{k_1 \ldots k_l}$ belonging to different index sequences $(k_1, \ldots, k_l)$ are disjoint,

and

(3.9)   the diameter of $D^*_{k_1 \ldots k_l}$ converges to zero as $l \to \infty$ uniformly in $k_1, \ldots, k_l$.
Lemma 3.1. For any sequence $\{\zeta_i\}$ ($i = 1, 2, \ldots$, ad inf.) of decision rules there exists a subsequence $\{\zeta_{i_j}\}$ ($j = 1, 2, \ldots$, ad inf.) and a decision rule $\zeta_0$ such that $\lim_{j \to \infty} \zeta_{i_j} = \zeta_0$.
Proof: Let $H_{r,i}(S_r, D^*)$ ($r = 1, 2, \ldots$, ad inf.) be the sequence of functions associated with $\zeta_i$. Let, furthermore, $\{D^*_{k_1 \ldots k_l}\}$ be a sequence of subsets of $D$ satisfying the relations (3.7), (3.8) and (3.9). Clearly, for any fixed $r$ and any fixed element $D^*_{k_1 \ldots k_l}$ of the sequence $\{D^*_{k_1 \ldots k_l}\}$, it is possible to find a subsequence $\{i_j\}$ ($j = 1, 2, \ldots$, ad inf.) of the sequence $\{i\}$ (the subsequence $\{i_j\}$ may depend on $r$ and $D^*_{k_1 \ldots k_l}$) and a set function $H_{r,0}(S_r)$ such that

(3.10)   $\lim_{j \to \infty} H_{r,i_j}(S_r, D^*_{k_1 \ldots k_l}) = H_{r,0}(S_r)$.


Using the well known diagonal procedure, it is therefore possible to find a fixed subsequence $\{i_j\}$ (independent of $r$ and $D^*$) and a sequence of set functions $\{H_{r,0}(S_r, D^*_{k_1 \ldots k_l})\}$ such that

(3.11)   $\lim_{j \to \infty} H_{r,i_j}(S_r, D^*_{k_1 \ldots k_l}) = H_{r,0}(S_r, D^*_{k_1 \ldots k_l})$

for all values of $r$, $k_1, \ldots, k_l$ and $l$.


To complete the proof of Lemma 3.1, it remains to be shown that there exists a decision rule $\zeta_0$ such that the associated function $H_r(S_r, D^*)$ is equal to $H_{r,0}(S_r, D^*)$ for any $D^*$ that is an element of $\{D^*_{k_1 \ldots k_l}\}$. Since $h_{r,i}(x^1, \ldots, x^r, D^*)$ is uniformly bounded, the set function $H_{r,0}(S_r, D^*_{k_1 \ldots k_l})$ is absolutely continuous. Hence for any values of $k_1, \ldots, k_l$ there exists a function $h_{r,0}(x^1, \ldots, x^r, D^*_{k_1 \ldots k_l})$ such that

(3.12)   $\int_{S_r} h_{r,0}(x^1, \ldots, x^r, D^*_{k_1 \ldots k_l})\, dx^1 \cdots dx^r = H_{r,0}(S_r, D^*_{k_1 \ldots k_l})$.


The existence of a $\zeta_0$ with the desired property is proved, if we show that the functions $h_{r,0}(x^1, \ldots, x^r, D^*_{k_1 \ldots k_l})$ satisfy the relations (3.2) and (3.3). Let $h_{r,i}(x^1, \ldots, x^m, D^*) = h_{r,i}(x^1, \ldots, x^r, D^*)$ for any $m > r$. Then, since the functions $h_{r,i}$ satisfy (3.2), we have

(3.13)   $\sum_{r=1}^{m} H_{r,i}(S_m, D^*) \le V(S_m)$

where $V(S_m)$ denotes the $m$-dimensional Lebesgue measure of $S_m$. From (3.13) it follows that

(3.14)   $\sum_{r=1}^{m} H_{r,0}(S_m, D^*_{k_1 \ldots k_l}) \le V(S_m)$.

Hence, the functions $h_{r,0}(x^1, \ldots, x^r, D^*_{k_1 \ldots k_l})$ must satisfy (3.2) except perhaps on a set of Lebesgue measure zero. Since the functions $h_{r,i}(x^1, \ldots, x^r, D^*)$ satisfy (3.3), we must have

(3.15)   $H_{r,i}(S_r, D^*_{k_1 \ldots k_l}) = \sum_{k_{l+1}=1}^{r_{l+1}} H_{r,i}(S_r, D^*_{k_1 \ldots k_l k_{l+1}})$.

Hence, the same relation must hold also for $H_{r,0}(S_r, D^*_{k_1 \ldots k_l})$. But this implies that the functions $h_{r,0}(x^1, \ldots, x^r, D^*_{k_1 \ldots k_l})$ satisfy (3.3) except perhaps on a set of Lebesgue measure zero, and the proof of Lemma 3.1 is completed.

Lemma 3.2. Let $T_i(S)$ ($i = 0, 1, 2, \ldots$) be a non-negative, completely additive set function defined for all measurable subsets $S$ of the $r$-dimensional sample space $M_r$. Assume that

(3.16)   $T_i(S) \le V(S)$

for all $S$ ($i = 0, 1, 2, \ldots$, ad inf.), where $V(S)$ denotes the Lebesgue measure of $S$. Let, furthermore, $g(x^1, \ldots, x^r)$ be a non-negative function such that

(3.17)   $\int_{M_r} g(x^1, \ldots, x^r)\, dx^1 \cdots dx^r < \infty$.

If

(3.18)   $\lim_{i \to \infty} T_i(S) = T_0(S)$

then

(3.19)   $\lim_{i \to \infty} \int_{M_r} g(x^1, \ldots, x^r)\, dT_i = \int_{M_r} g(x^1, \ldots, x^r)\, dT_0$.


Proof: Let $M_{r,c}$ be the sphere in $M_r$ with center at the origin and radius $c$. Clearly,

(3.20)   $\lim_{c \to \infty} \int_{M_{r,c}} g(x^1, \ldots, x^r)\, dx^1 \cdots dx^r = \int_{M_r} g(x^1, \ldots, x^r)\, dx^1 \cdots dx^r$.

Hence, because of (3.16), we have

(3.21)   $\lim_{c \to \infty} \left[ \int_{M_{r,c}} g(x^1, \ldots, x^r)\, dT_i - \int_{M_r} g(x^1, \ldots, x^r)\, dT_i \right] = 0$

uniformly in $i$. Hence our lemma is proved if we show that

(3.22)   $\lim_{i \to \infty} \int_{M_{r,c}} g(x^1, \ldots, x^r)\, dT_i = \int_{M_{r,c}} g(x^1, \ldots, x^r)\, dT_0$

for any finite $c$. Let $g_A(x^1, \ldots, x^r) = g(x^1, \ldots, x^r)$ when $g(x^1, \ldots, x^r) \le A$, and $= 0$ otherwise. Since

$\lim_{A \to \infty} \int_{M_r} (g - g_A)\, dx^1 \cdots dx^r = 0$

it follows from (3.16) that

(3.23)   $\lim_{A \to \infty} \int_{M_r} (g - g_A)\, dT_i = 0$

uniformly in $i$. Hence, our lemma is proved if we can show that

(3.24)   $\lim_{i \to \infty} \int_{M_{r,c}} g_A\, dT_i = \int_{M_{r,c}} g_A\, dT_0$

for any $c > 0$ and any $A > 0$. Let $S_j$ denote the set of all points in $M_{r,c}$ for which

(3.25)   $(j - 1)\varepsilon \le g_A < j\varepsilon$

where $\varepsilon$ is a given positive number. We have

(3.26)   $\sum_j (j - 1)\varepsilon \int_{S_j} dT_i \le \int_{M_{r,c}} g_A\, dT_i \le \sum_j j\varepsilon \int_{S_j} dT_i$   ($i = 0, 1, 2, \ldots$).

Since for any $\varepsilon$, $j$ can take only a finite number of values, and since $\varepsilon$ can be chosen arbitrarily small, our lemma follows easily from (3.18) and (3.26).

Lemma 3.3. Let $\{\zeta_i\}$ be a sequence of decision rules such that $\lim_{i \to \infty} \zeta_i = \zeta_0$ and $r(F, \zeta_i)$ is a bounded function of $F$ and $i$ ($i \ge 1$). Then

(3.27)   $\liminf_{i \to \infty} r(F, \zeta_i) \ge r(F, \zeta_0)$.


Proof: First we shall show that it is sufficient to prove Lemma 3.3 for any finite space $D$. For this purpose, assume that Lemma 3.3 is true for any finite decision space, but there exists a non-finite compact decision space $D$ and a sequence $\{\zeta_i\}$ such that $\lim_{i \to \infty} \zeta_i = \zeta_0$ and

(3.28)   $\liminf_{i \to \infty} r(F, \zeta_i) = r(F, \zeta_0) - \delta$ for some $F$ ($\delta > 0$).

Since $\zeta_i \to \zeta_0$, there exists a sequence $\{D^*_{k_1 \ldots k_l}\}$ of subsets of $D$ satisfying the conditions (3.7)-(3.9) and such that

(3.29)   $\lim_{i \to \infty} H_{r,i}(S_r, D^*_{k_1 \ldots k_l}) = H_{r,0}(S_r, D^*_{k_1 \ldots k_l})$

where $H_{r,i}(S_r, D^*)$ is the function $H_r$ associated with $\zeta_i$ ($i = 0, 1, 2, \ldots$). Let $\lambda$ be a fixed value of $l$ and consider the corresponding finite sequence $\{D^*_{k_1 \ldots k_\lambda}\}$ of subsets of $D$. Let $k$ be the number of elements in this finite sequence. We select one point from each element of the finite sequence $\{D^*_{k_1 \ldots k_\lambda}\}$. Let the points selected be $d_1, d_2, \ldots, d_k$ and let $\bar{D}$ denote the set consisting of the points $d_1, \ldots, d_k$. Let $\bar{\zeta}_i$ be the decision rule defined as follows: the function $h_r(x^1, \ldots, x^r, d_j)$ associated with $\bar{\zeta}_i$ is equal to $h_{r,i}(x^1, \ldots, x^r, D_j^*)$, where $D_j^*$ is equal to the element of the finite sequence $\{D^*_{k_1 \ldots k_\lambda}\}$ which contains the point $d_j$ ($j = 1, \ldots, k$). Clearly, because of (3.29),

(3.30)   $\lim_{i \to \infty} \bar{\zeta}_i = \bar{\zeta}_0$.

Furthermore, for sufficiently large $\lambda$ we obviously have

(3.31)   $| r(F, \zeta_i) - r(F, \bar{\zeta}_i) | \le \varepsilon$ for $i = 0, 1, 2, \ldots$, ad inf.

Since for finite $D$ our lemma is assumed to be true, we have

(3.32)   $\liminf_{i \to \infty} r(F, \bar{\zeta}_i) \ge r(F, \bar{\zeta}_0)$.

Choosing $\varepsilon < \delta/2$, we obtain a contradiction from (3.28), (3.31) and (3.32). Thus, it is sufficient to prove Lemma 3.3 for finite $D$. In the remainder of the proof we shall assume that $D$ consists of the points $d_1, \ldots, d_k$.

The probability that we shall take exactly $m$ observations when $\zeta_i$ is used and $F$ is true is given by

(3.33)   $\text{prob}\{n = m \mid F, \zeta_i\} = \int_{M_m} p_m(x^1, \ldots, x^m \mid F)\, h_{m,i}(x^1, \ldots, x^m, D)\, dx^1 \cdots dx^m$

where $M_m$ denotes the $m$-dimensional sample space. Since

$\lim_{i \to \infty} H_{m,i}(S_m, D) = H_{m,0}(S_m, D)$,

it follows from Lemma 3.2 that

(3.34)   $\lim_{i \to \infty} \text{prob}\{n = m \mid F, \zeta_i\} = \text{prob}\{n = m \mid F, \zeta_0\}$.

Hence

(3.35)   $\lim_{i \to \infty} \text{prob}\{n \le m \mid F, \zeta_i\} = \text{prob}\{n \le m \mid F, \zeta_0\}$.

Since $r(F, \zeta_i)$ is a bounded function of $F$ and $i$ ($i \ge 1$), we must have

(3.36)   $\lim_{m \to \infty} \text{prob}\{n \le m \mid F, \zeta_i\} = 1$   ($i = 1, 2, \ldots$)

uniformly in $F$ and $i$. From (3.35) and (3.36) it follows that

(3.37)   $\lim_{m \to \infty} \text{prob}\{n \le m \mid F, \zeta_0\} = 1$

uniformly in $F$. Because of (3.36) and (3.37), we have

(3.38)   $r(F, \zeta_i) = \sum_{m=1}^{\infty} r_m(F, \zeta_i)$   ($i = 0, 1, 2, \ldots$, ad inf.),

where

(3.39)   $r_m(F, \zeta_i) = \sum_{j=1}^{k} \int_{M_m} p_m(x^1, \ldots, x^m \mid F)\, W(F, d_j)\, dH_{m,i}(S_m, d_j) + \int_{M_m} p_m(x^1, \ldots, x^m \mid F)\, c(x^1, \ldots, x^m)\, dH_{m,i}(S_m, D)$.

Since

$\lim_{i \to \infty} H_{m,i}(S_m, D^*) = H_{m,0}(S_m, D^*)$

for any subset $D^*$ of $D$, it follows from Lemma 3.2 that

(3.40)   $\lim_{i \to \infty} r_m(F, \zeta_i) = r_m(F, \zeta_0)$.

Lemma 3.3 is an immediate consequence of (3.38) and (3.40).

3.4. Equality of Sup Inf r and Inf Sup r, and other theorems. In this section 
we shall prove the main theorems for the continuous case, using the lemmas de- 
rived in the preceding section. 

Theorem 3.1. If Conditions 3.1-3.5 are fulfilled, then

(3.41)   $\sup_{\xi} \inf_{\zeta} r(\xi, \zeta) = \inf_{\zeta} \sup_{\xi} r(\xi, \zeta)$.


Proof: Let $Z^m$ denote the class of all $\zeta$'s for which $\text{prob}\{n \le m \mid \zeta, F\} = 1$ for all $F$. We shall denote an element of $Z^m$ by $\zeta^m$. First we shall show that it is sufficient if for any finite $m$ we can prove Theorem 3.1 under the restriction that $\zeta$ must be an element of $Z^m$. For this purpose, put $W_0 = \sup_{F, d} W(F, d)$ and choose a positive integer $m_\varepsilon$ so that

(3.42)   $c(x^1, \ldots, x^m) > \dfrac{W_0^2}{\varepsilon}$

for all $m \ge m_\varepsilon$. The existence of such a value $m_\varepsilon$ follows from Condition 3.4. We shall now show that for any $\xi$ we have

(3.43)   $\inf_{\zeta^m} r(\xi, \zeta^m) \le \inf_{\zeta} r(\xi, \zeta) + \varepsilon$ for any $m \ge m_\varepsilon$.

Let $\zeta_1$ be any decision rule. There are two cases to be considered: (a) $\text{prob}\{n \ge m_\varepsilon \mid \xi, \zeta_1\} \ge \varepsilon / W_0$; (b) $\text{prob}\{n \ge m_\varepsilon \mid \xi, \zeta_1\} < \varepsilon / W_0$. In case (a) we have $r(\xi, \zeta_1) \ge W_0$. In this case, let $\zeta_2$ be the rule that we decide for some $d$ without taking any observations. Clearly, we shall have $r(\xi, \zeta_2) \le W_0$ and, therefore, $r(\xi, \zeta_2) \le r(\xi, \zeta_1)$. In case (b), let $\zeta_2$ be defined as follows: $h_r(x^1, \ldots, x^r, D^*)$ for $\zeta_2$ is the same as that for $\zeta_1$ when $r < m_\varepsilon$, and $h_r(x^1, \ldots, x^r, d_0)$ for $\zeta_2$ is equal to $1 - \sum_{k=1}^{m_\varepsilon - 1} h_k(x^1, \ldots, x^k, D)$ when $r = m_\varepsilon$, and zero when $r > m_\varepsilon$, where $d_0$ is a fixed element of $D$. Since $\text{prob}\{n \ge m_\varepsilon \mid \xi, \zeta_1\} < \varepsilon / W_0$, we have

$r(\xi, \zeta_2) \le r(\xi, \zeta_1) + \varepsilon$.


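In both cases $\zeta_2$ is an element of $Z^{m_\varepsilon}$. Hence (3.43) is proved. From (3.43) we obtain

(3.44)   $\sup_{\xi} \inf_{\zeta} r \le \sup_{\xi} \inf_{\zeta^{m_\varepsilon}} r \le \sup_{\xi} \inf_{\zeta} r + \varepsilon$.

Assume now that

(3.45)   $\sup_{\xi} \inf_{\zeta^m} r = \inf_{\zeta^m} \sup_{\xi} r$

holds for any $m$. From (3.44) and (3.45) we obtain

(3.46)   $\inf_{\zeta^{m_\varepsilon}} \sup_{\xi} r \le \sup_{\xi} \inf_{\zeta} r + \varepsilon$.

Hence

(3.47)   $\inf_{\zeta} \sup_{\xi} r \le \sup_{\xi} \inf_{\zeta} r + \varepsilon$.

Since this is true for any $\varepsilon$, we have

(3.48)   $\inf_{\zeta} \sup_{\xi} r \le \sup_{\xi} \inf_{\zeta} r$.

Theorem 3.1 follows from (3.48) and Lemma 1.3.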










To complete the proof of Theorem 3.1, it remains to be shown that (3.45) holds for any $m$. Since $D$ is compact, (3.45) is proved if we can prove it for any finite $D$. In the remainder of the proof we shall, therefore, assume that $D$ consists of $k$ points $d_1, \ldots, d_k$. Let $\omega$ be a subset of $\Omega$ that is conditionally compact in the sense of the metric$^7$

(3.49)   $\delta_0(F_1, F_2) = \sup_{S_m} \left| \int_{S_m} dF_1 - \int_{S_m} dF_2 \right|$

where $S_m$ is a subset of the $m$-dimensional sample space. We shall show that $\omega$ is conditionally compact also in the sense of the intrinsic metric given by

(3.50)   $\delta_1(F_1, F_2) = \sup_{\zeta^m} | r(F_1, \zeta^m) - r(F_2, \zeta^m) |$.

Let

(3.51)   $\delta_2(F_1, F_2) = \sup_{d} | W(F_1, d) - W(F_2, d) |$

and

(3.52)   $\delta_3(F_1, F_2) = \delta_0(F_1, F_2) + \delta_2(F_1, F_2)$.


It follows from Condition 3.3 and Theorem 3.1 in [3] that $\Omega$, and therefore also $\omega$, is conditionally compact in the sense of the metric $\delta_2(F_1, F_2)$. Hence $\omega$ is conditionally compact in the sense of the metric $\delta_3(F_1, F_2)$. The conditional compactness of $\omega$ relative to the metric $\delta_1(F_1, F_2)$ is proved, if we can show that any sequence $\{F_i\}$ that is a Cauchy sequence relative to the metric $\delta_3$ is a Cauchy sequence also relative to the metric $\delta_1$. Let $\{F_i\}$ ($i = 1, 2, \ldots$, ad inf.) be a Cauchy sequence relative to $\delta_3$. Then there exists a distribution $F_0$ (not necessarily an element of $\Omega$) and a function $W(d)$ such that

(3.53)   $\lim_{i \to \infty} W(F_i, d) = W(d)$ uniformly in $d$

and

(3.54)   $\lim_{i \to \infty} \int_{S_m} dF_i = \int_{S_m} dF_0$

uniformly in $S_m$. We have

(3.55)   $r(F_i, \zeta^m) = \sum_{r=1}^{m} \sum_{j=1}^{k} \int_{M_r} p_r(x^1, \ldots, x^r \mid F_i)\, W(F_i, d_j)\, h_r(x^1, \ldots, x^r, d_j)\, dx^1 \cdots dx^r + \sum_{r=1}^{m} \int_{M_r} c(x^1, \ldots, x^r)\, p_r(x^1, \ldots, x^r \mid F_i)\, h_r(x^1, \ldots, x^r, D)\, dx^1 \cdots dx^r$,



$^7$ By $\int_{S_m} dF$ we mean $\int_{S_m} p_m(x^1, \ldots, x^m \mid F)\, dx^1 \cdots dx^m$.


where $M_r$ denotes the $r$-dimensional sample space. The sequence $\{F_i\}$ is a Cauchy sequence relative to the metric $\delta_1$ if there exists a function $r(\zeta^m)$ such that

(3.56)   $\lim_{i \to \infty} r(F_i, \zeta^m) = r(\zeta^m)$

uniformly in $\zeta^m$. Let $\bar{r}(F_i, \zeta^m)$ be the function we obtain from $r(F_i, \zeta^m)$ by replacing the quantities $W(F_i, d_j)$ by $W(d_j)$ under the first integral on the right hand side of (3.55). Because of (3.53), we have

(3.57)   $\lim_{i \to \infty} [r(F_i, \zeta^m) - \bar{r}(F_i, \zeta^m)] = 0$

uniformly in $\zeta^m$. Thus, (3.56) is proved if we can show the existence of a function $\bar{r}(\zeta^m)$ such that

(3.58)   $\lim_{i \to \infty} \bar{r}(F_i, \zeta^m) = \bar{r}(\zeta^m)$

uniformly in $\zeta^m$. Let $C$ be a class of functions $\varphi(x^1, \ldots, x^m)$ such that $| \varphi(x^1, \ldots, x^m) | < A < \infty$ for all $\varphi$ in $C$. It then follows from (3.54) that there exists a functional $g(\varphi)$ such that

(3.59)   $\lim_{i \to \infty} \int_{M_m} \varphi\, dF_i = g(\varphi)$

uniformly in $\varphi$. Application of this general result yields (3.58) immediately. Hence, $\{F_i\}$ is a Cauchy sequence relative to the metric $\delta_1$ and, therefore, $\omega$ is shown to be conditionally compact relative to the metric $\delta_1$ if it is so relative to the metric $\delta_0$.


It then follows from Theorem 3.2 in [3] that $\sup_{\xi} \inf_{\zeta^m} r = \inf_{\zeta^m} \sup_{\xi} r$ if we replace $\Omega$ by a subset $\omega$ that is conditionally compact relative to $\delta_0$.$^8$ Since $\Omega$ is separable relative to $\delta_0$, there exists a sequence $\{\Omega_i\}$ of subsets of $\Omega$ such that $\Omega_i$ is conditionally compact relative to $\delta_0$, $\Omega_{i+1} \supset \Omega_i$ and $\sum_{i=1}^{\infty} \Omega_i = \Omega^*$ is dense in $\Omega$. Let $\xi^i$ denote an a priori distribution $\xi$ for which $\xi(\Omega_i) = 1$. Since the left and right hand members in (3.45) remain unchanged when $\Omega$ is replaced by $\Omega^*$, it follows from Theorem 1.3 that equation (3.45) is proved if we can show that

(3.60)   $\lim_{i \to \infty} \inf_{\zeta^m} \sup_{\xi^i} r = \inf_{\zeta^m} \sup_{\xi} r$.

Let $\{\zeta_i^m\}$ ($i = 1, 2, \ldots$, ad inf.) be a sequence of decision rules such that

(3.61)   $\lim_{i \to \infty} \left[ \sup_{\xi^i} r(\xi^i, \zeta_i^m) - \inf_{\zeta^m} \sup_{\xi^i} r \right] = 0$.





$^8$ Strictly, we would have to write $\inf_{\eta^m}$ instead of $\inf_{\zeta^m}$, where $\eta^m$ is a probability measure in the space of all $\zeta^m$. But, since the use of any discrete probability measure is equivalent to the use of a $\zeta^m$, and since the restriction to discrete $\eta^m$ does not change $\sup \inf r$ or $\inf \sup r$, we can replace $\inf_{\eta^m}$ by $\inf_{\zeta^m}$.


According to Lemmas 3.1 and 3.3, there exists a subsequence $\{i_j\}$ of $\{i\}$ and a decision rule $\zeta_0^m$ such that

(3.62)   $\liminf_{j \to \infty} r(F, \zeta_{i_j}^m) \ge r(F, \zeta_0^m)$ for all $F$.

Since $\Omega^*$ is dense in $\Omega$, it follows from (3.61) and (3.62) that

(3.63)   $\sup_{F} r(F, \zeta_0^m) \le \lim_{i \to \infty} \inf_{\zeta^m} \sup_{\xi^i} r$

and, therefore, (3.60) holds. Thus, (3.45) is proved and the proof of Theorem 3.1 is completed.
Theorem 3.2. If Conditions 3.1-3.5 are fulfilled, then there exists a minimax solution, i.e., a decision rule $\zeta_0$ for which

(3.64)   $\sup_{F} r(F, \zeta_0) \le \sup_{F} r(F, \zeta)$ for all $\zeta$.

Proof: Because of Theorem 3.1 there exists a sequence $\{\zeta_i\}$ ($i = 1, 2, \ldots$, ad inf.) of decision rules such that

(3.65)   $\lim_{i \to \infty} \sup_{F} r(F, \zeta_i) = \inf_{\zeta} \sup_{F} r(F, \zeta)$.

According to Lemmas 3.1 and 3.3 there exists a subsequence $\{\zeta_{i_j}\}$ of $\{\zeta_i\}$ and a decision rule $\zeta_0$ such that

(3.66)   $\liminf_{j \to \infty} r(F, \zeta_{i_j}) \ge r(F, \zeta_0)$ for all $F$.

It follows from (3.65) and (3.66) that $\zeta_0$ is a minimax solution and Theorem 3.2 is proved.

Theorem 3.3. If Conditions 3.1-3.5 are fulfilled, then for any $\xi$ there exists a Bayes solution relative to $\xi$.

This theorem is an immediate consequence of Lemmas 3.1 and 3.3.

Theorem 3.4. If Conditions 3.1-3.5 are fulfilled, then the class of all Bayes solutions in the wide sense is a complete class.

The proof is omitted, since it is entirely analogous to that of Theorem 2.5.
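The notions of minimax solution and Bayes solution used in Theorems 3.2 and 3.3 can be made concrete on a toy problem. The sketch below is only an illustration under strong simplifying assumptions: $\Omega$ consists of two distributions, only three decision rules are compared, and the risk values are invented for the example. The names `minimax_rule` and `bayes_rule` are hypothetical.

```python
def minimax_rule(risk):
    # risk maps a rule name to the list [r(F_1, rule), r(F_2, rule), ...].
    # The minimax rule minimizes the largest risk over the distributions F.
    return min(risk, key=lambda rule: max(risk[rule]))

def bayes_rule(risk, prior):
    # The Bayes rule relative to the a priori distribution `prior`
    # minimizes the prior-weighted average risk.
    def average_risk(rule):
        return sum(p * r for p, r in zip(prior, risk[rule]))
    return min(risk, key=average_risk)

# Invented risk table: three rules, two distributions.
risk = {"a": [1.0, 4.0], "b": [3.0, 2.0], "c": [2.5, 2.5]}

print(minimax_rule(risk))
print(bayes_rule(risk, [0.75, 0.25]))
print(bayes_rule(risk, [0.25, 0.75]))
```

In this table the rule with constant risk is minimax, while the Bayes rule changes with the a priori distribution. The theorems above concern the far more general setting of randomized sequential rules and infinite $\Omega$, which this finite comparison does not capture.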

3.5. Formulation of an additional condition. In this section we shall formulate an additional condition which will permit the derivation of some stronger theorems. Let the metric $\delta_0(F_1, F_2)$ be defined by


$\delta_0(F_1, F_2) = \sum_{m=1}^{\infty} \frac{1}{2^m} \sup_{S_m} \left| \int_{S_m} dF_1 - \int_{S_m} dF_2 \right|$


where $S_m$ may be any subset of the $m$-dimensional sample space.

Condition 3.6. The space $\Omega$ is compact relative to the metric $\delta_0(F_1, F_2)$, and

$\lim_{i \to \infty} W(F_i, d) = W(F_0, d)$

uniformly in $d$ if $\lim_{i \to \infty} \delta_0(F_i, F_0) = 0$.


Theorem 3.5. If Conditions 3.1-3.6 hold, then

(i) there exists a least favorable a priori distribution;

(ii) any minimax solution is a Bayes solution in the strict sense;

(iii) for any decision rule $\zeta_0$ which is not a Bayes solution in the strict sense and for which $r(F, \zeta_0)$ is a bounded function of $F$ there exists a decision rule $\zeta_1$ which is a Bayes solution in the strict sense and is uniformly better than $\zeta_0$.

Proof: The proofs of (i) and (ii) are entirely analogous to those of (i) and (ii) in Theorem 2.6, and will therefore be omitted here.

To prove (iii), let $\zeta_0$ be a decision rule that is not a Bayes solution in the strict sense and for which $r(F, \zeta_0)$ is bounded. We replace the weight function $W(F, d)$ by $W^*(F, d) = W(F, d) - r(F, \zeta_0)$. We shall show that (i) remains valid when $W(F, d)$ is replaced by $W^*(F, d)$. This is not obvious, since $r(F, \zeta_0)$, and therefore also $W^*(F, d)$, may not be continuous in $F$. First we shall prove that

(3.67)   $\liminf_{i \to \infty} r(\xi_i, \zeta_0) \ge r(\xi_0, \zeta_0)$

for any sequence $\{\xi_i\}$ for which

$\lim_{i \to \infty} \xi_i(\omega) = \xi_0(\omega)$

for any open subset $\omega$ of $\Omega$ (in the sense of the metric $\delta_0$) whose boundary has probability measure zero according to $\xi_0$. Let $r_m(F, \zeta)$ denote the conditional expected value of the loss $W(F, d)$ plus the cost of experimentation when $n = m$, $F$ is true and the rule $\zeta$ is used by the statistician (see equation (3.39)). Since $W(F, d)$ and the cost of experimentation when $m$ observations are taken are uniformly bounded, one can easily verify that

(3.68)   $\lim_{i \to \infty} r_m(F_i, \zeta_0) = r_m(F_0, \zeta_0)$

for any sequence $\{F_i\}$ for which

(3.69)   $\lim_{i \to \infty} \delta_0(F_i, F_0) = 0$.

Hence, since $\Omega$ is compact (Condition 3.6),

(3.70)   $\lim_{i \to \infty} r_m(\xi_i, \zeta_0) = r_m(\xi_0, \zeta_0)$

where

(3.71)   $r_m(\xi, \zeta_0) = \int_{\Omega} r_m(F, \zeta_0)\, d\xi$.

Since

$r(\xi, \zeta_0) = \sum_{m=1}^{\infty} r_m(\xi, \zeta_0)$,

inequality (3.67) follows from (3.70).

The remainder of the proof of (iii) will be omitted here, since it is the same as that of (iii) in Theorem 2.6.










We shall now replace Condition 3.6 by the following weaker one. 


Condition 3.6*. There exists a sequence $\{\Omega_i\}$ ($i = 1, 2, \ldots$, ad inf.) of subsets of $\Omega$ such that Condition 3.6 is fulfilled when $\Omega$ is replaced by $\Omega_i$, $\Omega_{i+1} \supset \Omega_i$ and $\lim_{i \to \infty} \Omega_i = \Omega$.

Theorem 3.6. If Conditions 3.1-3.5 and 3.6* are fulfilled, then

(i) A minimax solution $\zeta_0$ and a sequence $\{\zeta_i\}$ ($i = 1, 2, \ldots$, ad inf.) exist such that $\lim_{i \to \infty} \zeta_i = \zeta_0$ and $\zeta_i$ ($i = 1, 2, \ldots$, ad inf.) is a Bayes solution in the strict sense.

(ii) For any decision rule $\zeta_0$ for which $r(F, \zeta_0)$ is bounded there exists another decision rule $\zeta_1$ such that $\zeta_1$ is a limit of a sequence of Bayes solutions in the strict sense and $r(F, \zeta_1) \le r(F, \zeta_0)$ for all $F$ in $\Omega$.

Proof: According to Theorem 3.5, for each $i$ there exists a decision rule $\zeta_i$ ($i = 1, 2, \ldots$, ad inf.) such that $\zeta_i$ is a minimax solution and a Bayes solution in the strict sense when $\Omega$ is replaced by $\Omega_i$. Let $\{\zeta_{i_j}\}$ be a subsequence of the sequence $\{\zeta_i\}$ such that $\{\zeta_{i_j}\}$ admits a limit $\zeta_0$, i.e., $\lim_{j \to \infty} \zeta_{i_j} = \zeta_0$. Because of Lemma 3.3,

(3.72)   $\liminf_{j \to \infty} r(F, \zeta_{i_j}) \ge r(F, \zeta_0)$.

Hence $\zeta_0$ is a minimax solution relative to the original space $\Omega$ and statement (i) is proved.

To prove (ii), replace $W(F, d)$ by $W^*(F, d) = W(F, d) - r(F, \zeta_0)$, where $\zeta_0$ is a decision rule for which $r(F, \zeta_0)$ is bounded. In proving statement (iii) of Theorem 3.5, we have shown that there exists a decision rule $\zeta_{1i}$ ($i = 1, 2, \ldots$, ad inf.) such that $\zeta_{1i}$ is a minimax solution and a Bayes solution in the strict sense when $\Omega$ is replaced by $\Omega_i$ and $W(F, d)$ by $W^*(F, d)$. Clearly, $\zeta_{1i}$ remains a Bayes solution in the strict sense also relative to $\Omega$ and $W(F, d)$. Since $\zeta_{1i}$ is a minimax solution relative to $\Omega_i$ and $W^*(F, d)$, we have

(3.73)   $r(F, \zeta_{1i}) \le r(F, \zeta_0)$ for all $F$ in $\Omega_i$.

Let $\{\zeta_{1i_j}\}$ be a convergent subsequence of $\{\zeta_{1i}\}$ and let $\lim_{j \to \infty} \zeta_{1i_j} = \zeta_1$. Then, because of Lemma 3.3, we have

$r(F, \zeta_1) \le r(F, \zeta_0)$ for all $F$ in $\Omega$.

Since $\zeta_1$ is a limit of a sequence of Bayes solutions in the strict sense, statement (ii) is proved.

Addition at proof reading. After this paper was sent to the printer the author found that $\Omega$ is always separable (in the sense of the convergence definition in Condition 3.5) and, therefore, Condition 3.5 is unnecessary. A proof of the separability of $\Omega$ will appear in a forthcoming publication of the author.

The boundedness of $r(F, \zeta_i)$ is not necessary for the validity of Lemma 3.3. Let $\lim_{i \to \infty} \zeta_i = \zeta_0$ and suppose that for some $F$, say $F_0$, $r(F_0, \zeta_i)$ is not bounded in $i$. If $\liminf_{i \to \infty} r(F_0, \zeta_i) = \infty$, Lemma 3.3 obviously holds for $F = F_0$. If $\liminf_{i \to \infty} r(F_0, \zeta_i) = g < \infty$, let $\{i_j\}$ be a subsequence of $\{i\}$ such that $\lim_{j \to \infty} r(F_0, \zeta_{i_j}) = g$. Since $r(F_0, \zeta_{i_j})$ is a bounded function of $j$, Lemma 3.3 is applicable and we obtain $g \ge r(F_0, \zeta_0)$. In a similar way, one can see that also Lemma 2.4 remains valid without assuming the boundedness of $r(F, \eta_i)$.

Although not stated explicitly, several functions considered in this paper are assumed to be measurable with respect to certain additive classes of subsets. In the continuous case, for example, the precise measurability assumptions may be stated as follows: Let $B$ be the class of all Borel subsets of the infinite dimensional sample space $M$. Let $H$ be the smallest additive class of subsets of $\Omega$ which contains any subset of $\Omega$ which is open in the sense of at least one of the convergence definitions considered in this paper. Let $T$ be the smallest additive class of subsets of $D$ which contains all open subsets of $D$ (in the sense of the metric $\delta(d_1, d_2; \Omega)$). By the symbolic product $H \times T$ we mean the smallest additive class of subsets of the Cartesian product $\Omega \times D$ which contains the Cartesian product of any member of $H$ by any member of $T$. The symbolic product $H \times B$ is similarly defined. It is assumed that: (1) $W(F, d)$ is measurable $(H \times T)$; (2) $p_m(x^1, \ldots, x^m \mid F)$ is measurable $(B \times H)$; (3) $\delta_{x^1 \ldots x^r}(D^*)$ is measurable $(B)$ for any member $D^*$ of $T$; (4) $z_r(x^1, \ldots, x^r)$ and $c_r(x^1, \ldots, x^r)$ are measurable $(B)$. These assumptions are sufficient to insure the measurability $(H)$ of $r(F, \zeta)$ for any $\zeta$.


REFERENCES 


[1] J. v. Neumann and Oskar Morgenstern, Theory of Games and Economic Behavior, Princeton University Press, 1944.

[2] A. Wald, "Generalization of a theorem by v. Neumann concerning zero sum two-person games," Annals of Mathematics, Vol. 46 (April, 1945).

[3] A. Wald, "Foundations of a general theory of sequential decision functions," Econometrica, Vol. 15 (October, 1947).








THE MULTIPLICATIVE PROCESS 


By Richard Otter$^{1, 2}$


University of Notre Dame 


1. Introduction and summary. The multiplicative process is usually defined by the sequence of random variables $X_0, X_1, \ldots$ whose distributions are specified as follows: $P(X_0 = 1) = 1$, $\sum_{\nu=0}^{\infty} P(X_1 = \nu) = 1$, and if $X_n = 0$ then $P(X_{n+1} = 0) = 1$, whereas if $X_n$ is a positive integer then $X_{n+1}$ is distributed as the sum of $X_n$ independent random variables each with the distribution of $X_1$. The variable $X_n$ is interpreted as the number of "particles" in the $n$th generation, and the index $n$ as a discrete time parameter. This has been the method of approach in previous studies of the process [1, 2, 3, 4, 5]. The multiplicative process has various applications, notably in the study of population growth, the spread of epidemics or rumors, and the nuclear chain reaction. The closely related "birth and death" process was recently studied by Kendall [6].
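The recursive definition above translates directly into a simulation. The following is a minimal sketch, not part of the paper: the offspring law is passed as a hypothetical list `offspring_probs`, where entry $\nu$ stands for $P(X_1 = \nu)$, and the function names are chosen for illustration only.

```python
import random

def next_generation(x_n, offspring_probs, rng):
    # Sample X_{n+1} given X_n = x_n: the sum of x_n independent draws,
    # each distributed like X_1. If x_n = 0 the process stays at 0.
    if x_n == 0:
        return 0
    values = list(range(len(offspring_probs)))
    return sum(rng.choices(values, weights=offspring_probs, k=x_n))

def simulate(offspring_probs, generations, seed=0):
    # Returns [X_0, X_1, ..., X_generations], starting from X_0 = 1.
    rng = random.Random(seed)
    sizes = [1]
    for _ in range(generations):
        sizes.append(next_generation(sizes[-1], offspring_probs, rng))
    return sizes

# Degenerate laws give deterministic paths, which makes the rule easy to check.
print(simulate([0.0, 0.0, 1.0], 3))  # every particle has exactly two offspring
print(simulate([1.0], 3))            # every particle has no offspring
print(simulate([0.25, 0.5, 0.25], 5))
```

The first two calls use degenerate offspring laws, so the generation sizes double or die out regardless of the seed; the third call is a genuinely random path.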

Whenever one studies the probability theory of a particular system there seem to be definite conceptual advantages in defining explicitly the set $\mathfrak{T}$ of elementary events, the additive class $\mathfrak{M}$ of subsets of $\mathfrak{T}$, called events, and the probability measure $P$ for the events of $\mathfrak{M}$. Now an elementary event of this process can be represented by a rooted tree where the original particle is represented by the root vertex and where the particles of the $n$th generation are represented by the vertices $n$ segments removed from the root. The tree will be finite or infinite according to whether a finite or an infinite number of particles are involved in the elementary event. Thus, the set of trees is the natural choice for $\mathfrak{T}$. The first part of this paper is devoted to a more precise description of $\mathfrak{T}$, $\mathfrak{M}$ and $P$. We shall then see easily that $X_n(t)$, the number of vertices $n$ segments removed from the root of $t \in \mathfrak{T}$, i.e. the number of particles in the $n$th generation, has the distribution defined in the preceding paragraph. Since the time does not appear in our description of $\mathfrak{T}$, we fetter ourselves somewhat if we interpret $n$ as a discrete time parameter. Thus, we have already reaped some harvest from considering the process from the point of view of $\mathfrak{T}$. Another advantage is that we are led in a natural way to study the distribution of other structural features of the trees, e.g. the total number of vertices, or the number of vertices with $k$ outgoing segments.

The chief results of this paper are as follows. The recursion formula for the
probability $P_n$ that a tree have n vertices, $n = 1, 2, \cdots$, is obtained as well as an
asymptotic estimate of $P_n$ valid for large n. The distributions of the number of
branches at the root in a finite tree, an infinite tree, or in a tree with n vertices
are obtained and the asymptotic distribution of the latter as $n \to \infty$. The





1 Research under an Office of Naval Research contract. 
2 The author wishes to express his gratitude to Professor E. Artin of Princeton University 
for the suggestion of this problem and his encouragement towards its solution. 




THE MULTIPLICATIVE PROCESS 207 


distribution of the fraction of vertices with k outgoing segments in the finite
trees, in the trees with n vertices, and the asymptotic distribution of the latter
as $n \to \infty$ are also found. Finally, an estimate is obtained for the probability
that a tree be finite in case this probability is near 1, a result which was previously
obtained by Kolmogoroff [7].


2. The space of trees. We shall use the notation $\{a\}$, $\{a_1, a_2, \cdots a_n\}$,
$\{a_j\}_{j \in J}$, and $\{a_j \mid R\}_{j \in J}$ to denote the sets which consist of respectively the single
element a, the elements $a_1, a_2, \cdots a_n$, all $a_j$ with $j \in J$, and all $a_j$ with the
property R and $j \in J$. We denote the union of two sets A and B by A + B, their
intersection by AB, and the cartesian product of n identical factors each of
which is A by $A^{(n)}$.

Let I denote the set of positive integers. We assume given for each $n \in I$ a
countable set $U_n$ of objects $u_{i_1 i_2 \cdots i_n}$ called vertices, i.e.


$U_n = \{u_{i_1 i_2 \cdots i_n}\}_{(i_1, i_2, \cdots i_n) \in I^{(n)}}$.


Let $u_0$ be a vertex distinct from all the other vertices and let $U = \{u_0\} + \sum_n U_n$
be the collection of all the vertices. We shall interpret $u_0$ as the original parent
particle and the vertex $u_{152}$, for example, as the second son of the fifth son of the
first son of the original particle. If s is a subset of U, $s \subset U$, and if $i_1, i_2, \cdots i_{n+m}$
are such that $u_{i_1 i_2 \cdots i_n}, u_{i_1 i_2 \cdots i_n i_{n+1}}, \cdots, u_{i_1 i_2 \cdots i_n i_{n+1} \cdots i_{n+m}}$ each belong to s then
this set of vertices is called a path from $u_{i_1 i_2 \cdots i_n}$ to $u_{i_1 i_2 \cdots i_{n+m}}$ in s and $m \ge 0$ is the
length of the path or the distance from $u_{i_1 i_2 \cdots i_n}$ to $u_{i_1 i_2 \cdots i_{n+m}}$. If m = 1 we call
the path a segment, for short.

For the sake of convenience let us agree to put Uj,i,...i0 = UWizig---ig » (M > 1) 
then we define $W(s, u)$, for $u \in s \subset U$, to be the number of segments from u in s,
and we call $W(s, u)$ the type of the vertex u in s. If t is a subset of U, then we
call t a tree if and only if


(1)  $W(t, u) < \infty$ for $u \in t$
and
(2)  $u_{i_1 i_2 \cdots i_n} \in t$ implies $u_{i_1 i_2 \cdots i_\nu} \in t$ for $\nu = 0, 1, \cdots n$.


Let $\mathcal{T}$ be the set of all trees. The condition (2) clearly implies that for each
$t \in \mathcal{T}$ we have $u_0 \in t$ and that there is a unique path from $u_0$ to any other vertex
of t. Hence, whenever a path exists between any two vertices of t it is unique.
We call $u_0$ the root of t. If for $u \in t \in \mathcal{T}$ we have $W(t, u) = 0$ then u is called an
endpoint of t, and the vertices of t which are not endpoints are called inner
vertices. (It is to be noted that the objects we call trees here are rooted trees
in the sense of Cayley but our trees have their vertices numbered as well.
Usually one would identify the trees $\{u_0, u_1, u_2, u_{11}\}$ and $\{u_0, u_1, u_2, u_{21}\}$,
but we do not wish to do so because for us it is distinctly different whether the
grandson is sired by the first son or by the second son.)
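The definitions of this section translate directly into a small data structure. In the sketch below a vertex $u_{i_1 i_2 \cdots i_n}$ is encoded as the tuple `(i_1, ..., i_n)` and the root $u_0$ as the empty tuple; this encoding and the function names are assumptions made for illustration only.

```python
def is_tree(vertices):
    """Check condition (2): with each vertex, all its ancestors back to the root belong to the set.

    Condition (1), finiteness of every type W(t, u), holds automatically
    for a finite vertex set.
    """
    vertex_set = set(vertices)
    if () not in vertex_set:
        return False  # every tree contains the root u_0
    return all(v[:k] in vertex_set for v in vertex_set for k in range(len(v)))

def vertex_type(tree, u):
    """W(t, u): the number of segments from u in the tree."""
    return sum(1 for v in tree if len(v) == len(u) + 1 and v[:len(u)] == u)

def endpoints(tree):
    """Vertices of type 0, in sorted order."""
    return sorted(u for u in tree if vertex_type(tree, u) == 0)

if __name__ == "__main__":
    first = {(), (1,), (2,), (1, 1)}   # grandson sired by the first son
    second = {(), (1,), (2,), (2, 1)}  # grandson sired by the second son
    print(is_tree(first), is_tree(second), first == second)
```

The two trees of the parenthetical remark are both valid and have the same shape, yet they are different sets of vertices, which is exactly the distinction the text insists on.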

For $u \in t \in \mathcal{T}$ we define the branch of t at u to be the set of all vertices belonging








208 RICHARD OTTER 


to any path from u in t. Our convention of admitting paths of length 0 implies
that $u \in b(t, u)$. In fact, if $W(t, u) = 0$ then $b(t, u) = \{u\}$. If $t'$ is a tree such
that $t' \subset t$ then we call t an extension of $t'$, denoted $t > t'$ or $t' < t$, if $W(t', u) > 0$
implies $W(t', u) = W(t, u)$. Thus $t > t'$ is equivalent to $t \supset t'$ and


$t = t' + \sum_u b(t, u)$


where u runs through all the endpoints of $t'$. The extension relation imposes a
partial ordering upon $\mathcal{T}$.

The extension t of $t'$ is interpreted as a possible future aspect of a family tree
when its structure at present is given by $t'$, all present members of the family
who have progeny being regarded as sterile.

If $u = u_{i_1 i_2 \cdots i_n}$ then the mapping $\varphi$ defined for the vertices of $b(t, u)$ by
putting


$\varphi(u_{i_1 i_2 \cdots i_n i_{n+1} \cdots i_{n+m}}) = u_{i_{n+1} \cdots i_{n+m}}$


maps $b(t, u)$ one to one onto a tree $\varphi(b(t, u))$ in such a fashion that if $\{v_1, v_2\}$ is a
segment from $v_1$ to $v_2$ in $b(t, u)$ then $\{\varphi(v_1), \varphi(v_2)\}$ is a segment from $\varphi(v_1)$ to
$\varphi(v_2)$ in $\varphi(b(t, u))$. We call the mapping $\varphi$ a homeomorphism and we say that
$b(t, u)$ is homeomorphic to $\varphi(b(t, u))$.

If a tree contains a finite number of vertices then it is called a finite tree;
otherwise it is an infinite tree. Let $\mathcal{F}$ denote the set of all finite trees and $\mathcal{I}$ the
set of all infinite trees, and let $\mathcal{K}$ denote the set of non-negative integers. For
each $k \in \mathcal{K}$ we define $Y_k(t)$ for $t \in \mathcal{F}$ to be the number of vertices of type k in t.
When it is clear to which tree t we refer we shall usually abbreviate $Y_0(t)$ by m,
and we agree not to use the letter m with any other connotation. For each
$T \in \mathcal{F}$ let $e_1(T), e_2(T), \cdots e_m(T)$ denote its m endpoints. We then define for
$T \in \mathcal{F}$ and $\kappa = (k_1, k_2, \cdots k_m) \in \mathcal{K}^{(m)}$


$[T, \kappa] = \{t \mid t > T,\ W(t, e_i(T)) = k_i,\ i = 1, 2, \cdots m,\ t \in \mathcal{T}\}$,


and we call $[T, \kappa]$ a neighborhood. For each $t \in [T, \kappa]$ we say $[T, \kappa]$ is a neighborhood
of t. Then it is easy to show that $\mathcal{T}$ is a topological space where the neighborhoods
defined above form the defining system of neighborhoods [8].


3. The measure theory in $\mathcal{T}$. In the following paragraphs an outline of the
measure theory in $\mathcal{T}$ is given which omits proofs for the most part since they are
easily constructed. The only point of difficulty arises in showing the measure
function to be completely additive, but here the outline has more detail.

Let $\mathfrak{S}$ be the collection of subsets of $\mathcal{T}$ such that $0 \in \mathfrak{S}$ and any other set S
belongs to $\mathfrak{S}$ if and only if there is a $t \in \mathcal{F}$ and a non-void "rectangle set"
$A = A_1 \times A_2 \times \cdots A_m \subset \mathcal{K}^{(m)}$, $m = Y_0(t)$, such that

(3)  $S = \sum_{\kappa \in A} [t, \kappa]$


where the sets $A_1, A_2, \cdots A_m$ may be finite or infinite sets of non-negative




integers. The collection of neighborhoods which appear as terms in (3), i.e.
$\{[t, \kappa]\}_{\kappa \in A}$, we call an $\mathfrak{S}$-partition of S, and t is called the generator of the
$\mathfrak{S}$-partition. Only a finite number of $\mathfrak{S}$-partitions are possible for an $S \in \mathfrak{S}$,
because only a finite number of trees can possibly be generators and there is
only one $\mathfrak{S}$-partition per generator. With respect to our partial ordering of the
trees all possible generators lie between two particular ones. We call the
smaller of these the irreducible generator and the corresponding $\mathfrak{S}$-partition the
irreducible $\mathfrak{S}$-partition of S. Any partition of S into neighborhoods must be a
subpartition of this irreducible $\mathfrak{S}$-partition. The elements of $\mathfrak{S}$ also display
two important properties of the rectangles in Euclidean space, namely if
$S, S' \in \mathfrak{S}$ then


(4)  $SS' \in \mathfrak{S}$

and if $S \subset S'$ then there is a finite chain

(5)  $S = S_0 \subset S_1 \subset \cdots \subset S_n = S'$

such that $S_i,\ S_i - S_{i-1} \in \mathfrak{S}$ for $i = 1, 2, \cdots n$.


A class of sets with the properties (4) and (5) has been called a half-ring by von
Neumann [9].


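Let $p_0, p_1, \cdots$ be given non-negative numbers such that $\sum_{\nu=0}^{\infty} p_\nu = 1$. For
$t \in \mathcal{F}$ let us put


(6)  $P(t) = \prod_{k=1}^{\infty} p_k^{Y_k(t)}$


with the convention $0^0 = 1$. We then define the measure function P for the
sets in $\mathfrak{S}$ by


(7)  $P(0) = 0$,
     $P([t, \kappa]) = \left(\prod_{\nu=1}^{m} p_{k_\nu}\right) P(t)$, where $\kappa = (k_1, k_2, \cdots k_m) \in \mathcal{K}^{(m)}$,
     $P(S) = \sum_{\kappa \in A} P([t, \kappa])$, where $\{[t, \kappa]\}_{\kappa \in A}$ is the irreducible $\mathfrak{S}$-partition of S.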


P is evidently non-negative. Letting t be the tree with one vertex and putting
$A = \mathcal{K}$ gives $P(\mathcal{T}) = 1$. It is easy to see that P is completely additive for the
$\mathfrak{S}$-partitions of a neighborhood, but this implies P is completely additive for the
$\mathfrak{S}$-partitions of an arbitrary element S of $\mathfrak{S}$. In order to show that P is completely
additive for any partition of S into elements of $\mathfrak{S}$, it is necessary and
sufficient to show this for an arbitrary partition of a neighborhood into neighborhoods.
One may reach finer and finer partitions of a given neighborhood N by
replacing a neighborhood in any one partition by an $\mathfrak{S}$-partition of the neighborhood,
and repeating the process. The sum of the measures of the sets in the
partition is invariant under such a replacement. On the other hand it can be
shown that all possible partitions of N into neighborhoods may be reached in
this way. More precisely, let $\mathfrak{N} = \{N_i\}_{i \in I}$ be a partition of a neighborhood N








into neighborhoods $N_i$. We call $\mathfrak{N}$ reduced if whenever a subset of $\mathfrak{N}$ is an
$\mathfrak{S}$-partition of a neighborhood $M \subset N$ then the partition consists of M itself, i.e.
it is the irreducible $\mathfrak{S}$-partition of M. Then we have the following theorem:

THEOREM 1. If $\mathfrak{N}$ is a reduced partition of a neighborhood N into neighborhoods
then $\mathfrak{N} = \{N\}$.

The proof is indirect and proceeds by constructing a decreasing sequence of 
neighborhoods contained in N whose limit is not void and yet has nothing in 
common with any N;, but this is a contradiction. 

Let $\mathfrak{F}$ consist of all those sets which may be formed by finite unions of disjoint
elements of a half-ring $\mathfrak{S}$, then $\mathfrak{F}$ is a field of sets. If P is a completely additive
measure on $\mathfrak{S}$ then its natural extension $P_1$ is completely additive on $\mathfrak{F}$ [9].
Kolmogoroff [10] has shown that the completely additive measure $P_1$ may
always be extended to a completely additive measure $P_2$ on the Borel field $\mathfrak{M}$,
i.e. the smallest additive class of sets containing $\mathfrak{F}$. Since $P_2(\mathcal{T}) = 1$, $P_2$ is a
probability measure. For simplicity we put $P_2 = P$. Let us also agree that if
M is the set of all trees with the property R we may write P(R) instead of P(M).
If N is a set with $P(N) > 0$ then $P(M/N)$ shall denote the conditional probability
of M, given N, i.e. $P(M/N) = (P(N))^{-1} P(MN)$.


4. Independence of the branches. In the multiplicative process the events 
occurring in one branch of a tree are independent of those in a second branch 
disjoint with the first and it is for this reason that the process is relatively simple 
to analyze. In this section we shall try to expose the character of this 
independence. 

For $T \in \mathcal{F}$, let $\mathcal{E}_T$ be the set of all extensions of T, then


$\mathcal{E}_T = \sum_{\kappa \in \mathcal{K}^{(m)}} [T, \kappa]$,

whence by (6) and (7) $P(\mathcal{E}_T) = P(T)$. The following lemma is then easily
established.


LEMMA 1. If $P(\mathcal{E}_T) > 0$ then $W(t, e_i(T))$, $i = 1, 2, \cdots m$, under the condition
$t \in \mathcal{E}_T$, are independent random variables each with the distribution,


(8)  $P(W(t, e_i(T)) = k / \mathcal{E}_T) = p_k$,  $k = 0, 1, 2, \cdots$.
In the particular case where $T = \{u_0\}$ we have $\mathcal{E}_T = \mathcal{T}$ and we put
$W(t) = W(t, u_0)$ for short. Thus W(t) tells what type of vertex the root of t is
and (8) becomes

$P(W = k) = p_k$,  $k = 0, 1, 2, \cdots$.

For $t \in \mathcal{T}$ and $n = 0, 1, 2, \cdots$ let $X_n(t)$ be the number of vertices of
t at distance n from its root. Then $X_0(t) = 1$ and $X_1(t) = W(t)$. If n, r are positive
integers then there is at least one $T \in \mathcal{F}$ which has r of its endpoints, say
$e_{i_1}(T), e_{i_2}(T), \cdots e_{i_r}(T)$, at distance n from the root and which also satisfies
$X_{n+1}(T) = 0$. Put

$\mathcal{E}_T^{(n,r)} = \{t \mid W(t, e_i(T)) = 0,\ i \ne i_1, \cdots i_r,\ t \in \mathcal{E}_T\}$.




Evidently for $t \in \mathcal{E}_T^{(n,r)}$


$X_{n+1}(t) = \sum_{\nu=1}^{r} W(t, e_{i_\nu}(T))$,


and a proof similar to that of lemma 1 gives

LEMMA 2. If $P(\mathcal{E}_T^{(n,r)}) > 0$ then $X_{n+1}(t)$, under the condition $t \in \mathcal{E}_T^{(n,r)}$,
is the sum of r independent random variables each with the distribution of $X_1$.

By (6) and (7) for $t \in \mathcal{M} \subset \mathcal{F}\mathcal{E}_T$


$P(t) = \prod_{\nu=0}^{\infty} p_\nu^{Y_\nu(t)}$,


which depends only upon the type of each vertex as it occurs in t. For those 
vertices which are inner vertices of 7, Y,(t) is constant. Any other vertex 
belongs to one and only one of $b(t, e_1(T)), b(t, e_2(T)), \cdots b(t, e_m(T))$ and its
type in t is, of course, the same as its type in the branch to which it belongs.
Furthermore, each branch is homeomorphic to just one tree in $\mathcal{F}$,


$b(t, e_i(T)) \to t_i$,  $i = 1, 2, \cdots m$.

Since the type of a vertex is preserved under homeomorphism we have

$P(t) = P(\mathcal{E}_T) P(t_1) P(t_2) \cdots P(t_m)$.


If, as t runs through $\mathcal{M}$, $(t_1, t_2, \cdots t_m)$ runs through $\mathcal{M}_1 \times \mathcal{M}_2 \times \cdots \mathcal{M}_m$,
we obtain


(10)  $P(\mathcal{M}) = P(\mathcal{E}_T) P(\mathcal{M}_1) P(\mathcal{M}_2) \cdots P(\mathcal{M}_m)$.


Let us hereafter put $p = P(\mathcal{F})$. In the particular case of (10) where $\mathcal{M} = \mathcal{F}\mathcal{E}_T$
we clearly have $\mathcal{M}_i = \mathcal{F}$, $i = 1, 2, \cdots m$, hence


(11)  $P(\mathcal{F}\mathcal{E}_T) = P(\mathcal{E}_T) \cdot p^m$.


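If we define $T_\nu$, $\nu = 0, 1, 2, \cdots$, to be the tree with $\nu + 1$ vertices which has
$W(T_\nu) = \nu$ then


(12)  $\mathcal{T} = \{u_0\} + \sum_{\nu=1}^{\infty} \mathcal{E}_{T_\nu}$,
      $\mathcal{F} = \{u_0\} + \sum_{\nu=1}^{\infty} \mathcal{F}\mathcal{E}_{T_\nu}$,

where

$\mathcal{E}_{T_i}\mathcal{E}_{T_j} = \mathcal{E}_{T_i}\{u_0\} = 0$,  $i \ne j$;
$P(\mathcal{E}_{T_\nu}) = p_\nu$,  $\nu = 1, 2, \cdots$.

From (11) and (12) we get

(13)  $\sum_{\nu=0}^{\infty} p_\nu p^\nu = p$.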










For $t \in \mathcal{F}\mathcal{E}_T$ let $Z(b(t, e_i(T)))$ be the number of vertices in the branch of t at
$e_i(T)$. In the particular case where $T = \{u_0\}$ we have $b(t, u_0) = t$ and Z(t)
is the number of vertices of t. If now


$\mathcal{F}_n = \{t \mid Z(t) = n,\ t \in \mathcal{F}\}$,  $n = 1, 2, \cdots$;
$P_n = P(\mathcal{F}_n)$,

then by putting $\mathcal{M} = \mathcal{M}_T^{(n_1, n_2, \cdots n_m)}$ where

(14)  $\mathcal{M}_T^{(n_1, n_2, \cdots n_m)} = \{t \mid Z(b(t, e_i(T))) = n_i,\ i = 1, 2, \cdots m,\ t \in \mathcal{F}\mathcal{E}_T\}$,
      $\mathcal{M}_i = \mathcal{F}_{n_i}$,  $i = 1, 2, \cdots m$,

we may apply (10), which gives

(15)  $P(t \in \mathcal{F}\mathcal{E}_T,\ Z(b(t, e_i(T))) = n_i,\ i = 1, 2, \cdots m) = P(\mathcal{E}_T) P_{n_1} P_{n_2} \cdots P_{n_m}$.

If $p > 0$ we may multiply and divide the right hand member of (15) by $p^m$
which leads us to the following lemma:

LEMMA 3. If $P(\mathcal{F}\mathcal{E}_T) > 0$, then $Z(b(t, e_i(T)))$, $i = 1, 2, \cdots m$, under the
condition $t \in \mathcal{F}\mathcal{E}_T$, are m independent random variables each with the distribution
of Z(t), given $t \in \mathcal{F}$.


5. The distribution of Z(t). Let f(w) be the generating function for the distribution
of W, i.e.


(16)  $f(w) = \sum_{\nu=0}^{\infty} p_\nu w^\nu$

where w is a complex variable. If one is interested in studying the sequence
$X_0, X_1, \cdots$ then one should define another sequence of functions $f_0, f_1, \cdots$
where $f_0(w) = w$ and $f_{n+1}(w) = f(f_n(w))$ for $n = 0, 1, 2, \cdots$. By computing
formally the expansion of $f_n(w)$ around w = 0 it is not difficult to show that
$f_n(w)$ is the generating function for $X_n$, i.e. $f_n(w) = \sum_{\nu=0}^{\infty} P(X_n = \nu) w^\nu$, which is
the starting point for the previous investigations of the multiplicative process.
But since we shall be mainly interested in the distribution of Z we define $\mathcal{P}(z)$
to be the corresponding generating function, i.e.


(17)  $\mathcal{P}(z) = \sum_{n=1}^{\infty} P_n z^n$.

Let $\rho$ and $\alpha$ be the radii of convergence of the power series in the right members
of (16) and (17) respectively. Since $f(1) = 1$ and $\mathcal{P}(1) = P(\mathcal{F}) \le 1$ we know
$\rho, \alpha \ge 1$ hence f(w) and $\mathcal{P}(z)$ are analytic in $|w| < \rho$ and $|z| < \alpha$ respectively.
The relation between the distribution of W and that of Z is put in evidence by
the following theorem:

THEOREM 2. Let


$\mathcal{G}(z, w) = z f(w) - w$,


then $w = \mathcal{P}(z)$ is the unique analytic solution of

(18)  $\mathcal{G}(z, w) = 0$


in a certain neighborhood of (0, 0).

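PROOF. Since $\mathcal{P}(z)$ is analytic at 0 and $\mathcal{P}(0) = 0$ it suffices to show that if we
substitute formally $\sum P_n z^n$ for w in $z \sum p_\nu w^\nu$ the coefficient of $z^n$ is uniquely
determined and is $P_n$.


(19)  $z \sum_{\nu=0}^{\infty} p_\nu (\mathcal{P}(z))^\nu = p_0 z + \sum_{n=2}^{\infty} \left( \sum_{\nu=1}^{\infty} \sum_{n_1 + \cdots + n_\nu = n-1} p_\nu P_{n_1} P_{n_2} \cdots P_{n_\nu} \right) z^n$.

If in (14) we put $T = T_\nu$, where $T_\nu$ was defined just before (12), then
$m = Y_0(T_\nu) = \nu$. Let us require in addition that the total number of vertices
in the branches be n - 1, i.e. $n_1 + n_2 + \cdots + n_\nu = n - 1$, then


(20)  $\mathcal{F}_n = \sum_{\nu=1}^{\infty} \sum_{\sum n_j = n-1} \mathcal{M}_{T_\nu}^{(n_1, \cdots n_\nu)}$,  $n = 2, 3, \cdots$


where

$\mathcal{M}_{T_i}^{(n_1, \cdots n_i)} \mathcal{M}_{T_j}^{(m_1, \cdots m_j)} = 0$,


unless i = j and $n_1 = m_1, n_2 = m_2, \cdots n_i = m_i$. By applying P to (20) and
using (15) we get the coefficient of $z^n$ in (19) for $n \ge 2$. This together with the
obvious fact that $P_1 = p_0$ completes the proof.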
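The proof above amounts to a recursion: $P_1 = p_0$ and, for $n \ge 2$, $P_n$ is the sum over $\nu \ge 1$ of $p_\nu$ times the coefficient of $z^{n-1}$ in $(\mathcal{P}(z))^\nu$, which involves only $P_1, \cdots P_{n-1}$. A minimal sketch of this computation with exact rational arithmetic follows; the function name and the finite list `p` of offspring probabilities are illustrative assumptions.

```python
from fractions import Fraction

def tree_size_probs(p, n_max):
    """Return [P_1, ..., P_{n_max}] from the recursion read off from (19).

    p[v] stands for p_v; only finitely many p_v are assumed non-zero.
    """
    P = [Fraction(0)] * (n_max + 1)  # P[n] holds P_n; P[0] is unused
    if n_max >= 1:
        P[1] = Fraction(p[0])
    for n in range(2, n_max + 1):
        # power[j] = coefficient of z^j in P(z)^v, for degrees j <= n - 1
        power = [Fraction(1)] + [Fraction(0)] * (n - 1)  # P(z)^0 = 1
        total = Fraction(0)
        for v in range(1, len(p)):
            new = [Fraction(0)] * n
            for i, c in enumerate(power):
                if c:
                    for j in range(1, n - i):
                        new[i + j] += c * P[j]
            power = new
            total += Fraction(p[v]) * power[n - 1]
        P[n] = total
    return P[1:]

if __name__ == "__main__":
    half = Fraction(1, 2)
    print(tree_size_probs([half, 0, half], 5))
```

For $p_0 = p_2 = 1/2$ the odd-indexed values are 1/2, 1/8, 1/16 and the even-indexed ones vanish, as one expects for trees whose inner vertices all have two outgoing segments.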

It is worthwhile noticing that by means of the formula of Bürmann and Lagrange
[11] we can solve the recursion formula for $P_n$ in terms of $p_0, p_1, \cdots$
namely


(21)  $P_n = \frac{1}{n!}\left[\frac{d^{n-1}}{dw^{n-1}}(f(w))^n\right]_{w=0} = \sum_{\substack{\sum \nu_j = n \\ \sum j\nu_j = n-1}} \frac{(n-1)!}{\nu_0!\,\nu_1!\cdots}\, p_0^{\nu_0} p_1^{\nu_1} \cdots$.


Now if t has n vertices we know from Euler's characteristic that
$\sum_j j Y_j(t) = n - 1$. Since $P(t) = \prod_j p_j^{Y_j(t)}$ we see from (21) that

$\frac{(n-1)!}{\nu_0!\,\nu_1!\cdots}$,  $\sum \nu_j = n$,  $\sum j\nu_j = n - 1$,

is the number of trees in $\mathcal{F}_n$ for which $Y_0(t) = \nu_0$, $Y_1(t) = \nu_1, \cdots$.
Evidently $w = \mathcal{P}(z)$ remains a solution of (18) for all z such that $|z| < \alpha$,
$|w| < \rho$. In case $p_0 = 0$ the constant 0 solves (18). Hence $\mathcal{P}(z) = 0$ for all
z and so $\mathcal{P}(1) = p = 0$. Conversely, if p = 0 then $P_1 = p_0 = 0$ which gives
COROLLARY 1. p = 0 if and only if $p_0 = 0$.
Since we wish to investigate the distribution in $\mathcal{F}$ we shall henceforth assume
$p_0 \ne 0$.
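Formula (21) can be checked numerically in its coefficient form, $P_n = \frac{1}{n}\,[w^{n-1}]\,(f(w))^n$, together with the count of trees having prescribed numbers of vertices of each type. The sketch below assumes a finite list `p` of offspring probabilities; the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b, max_len):
    """Multiply two coefficient lists, keeping only degrees below max_len."""
    out = [Fraction(0)] * min(len(a) + len(b) - 1, max_len)
    for i, x in enumerate(a):
        if x:
            for j, y in enumerate(b):
                if i + j < len(out):
                    out[i + j] += x * y
    return out

def p_n_lagrange(p, n):
    """P_n as (1/n) times the coefficient of w^(n-1) in f(w)^n."""
    f = [Fraction(c) for c in p]
    power = [Fraction(1)]
    for _ in range(n):
        power = poly_mul(power, f, n)
    coeff = power[n - 1] if n - 1 < len(power) else Fraction(0)
    return coeff / n

def trees_with_type_counts(nu):
    """(n-1)!/(nu_0! nu_1! ...) if sum nu_j = n and sum j*nu_j = n-1, else 0."""
    n = sum(nu)
    if n == 0 or sum(j * c for j, c in enumerate(nu)) != n - 1:
        return 0
    denom = 1
    for c in nu:
        denom *= factorial(c)
    return factorial(n - 1) // denom

if __name__ == "__main__":
    print(p_n_lagrange([Fraction(1, 2), 0, Fraction(1, 2)], 5))
    print(trees_with_type_counts([2, 1, 1]))
```

As a small check of the counting statement, there are three trees with two endpoints, one vertex of type 1 and one vertex of type 2: the root may be of type 1, or of type 2 with the type-1 vertex as its first or its second son.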
Any non-constant function g(z) which has a power series development possessing
non-negative coefficients $g(z) = \sum a_\nu z^\nu$, $a_\nu \ge 0$ with a positive radius of
convergence R has two properties that are important for us:


(22)  g(z) has a singularity at R.

(23)  If $\sum a_\nu R^\nu$ converges then $\sum a_\nu z^\nu$ converges absolutely and uniformly
for $|z| \le R$, and so the series defines a continuous function g(z) there. We
have $\lim_{z \to z_0} g(z) = \sum a_\nu z_0^\nu$ as long as the path of approach to $z_0$ lies in $|z| \le R$.
On the other hand, if as z approaches R through real values below R,
$z \to R-$, the limit of g(z) exists then $\sum a_\nu R^\nu$ converges. So if we put $g(R) =
\lim g(z) = \sum a_\nu R^\nu$ then the meaning is unique even allowing $\infty$ as a value.


Returning to $\mathcal{P}(z)$, if for $|z| < \alpha$ we have $|w| < \rho$ where $w = \mathcal{P}(z)$, then


(24)  $z = \frac{w}{f(w)} = \mathcal{P}^{-1}(w)$,

which shows the mapping is schlicht in such a domain and that the image domain
cannot contain zeros of f(w). Because of (23) and the fact that $\mathcal{P}(1)$ is finite
even if $\alpha = 1$ we see that the mapping is certainly one to one for $|z| \le 1$.

COROLLARY 2. p is the smallest root of f(w) = w in $0 < w \le 1$.

PROOF. (13) shows p is a root in the interval. If for $0 < w_0 \le p$ we have
$f(w_0) = w_0$ then by (24) $\mathcal{P}^{-1}(w_0) = 1$.

The following corollary is the well known criterion for extinction.

COROLLARY 3. p = 1 if and only if $f'(1) \le 1$.

PROOF. p = 1, $p_0 > 0$, and the convexity of f(w) in $0 \le w \le 1$ guarantee
that $(f(w) - 1)/(w - 1)$ is bounded by 1 and is monotonic increasing with w.
Hence $f'(1)$ exists and is $\le 1$.

Conversely, if $f'(1) \le 1$ then either $f'(w)$ is constant ($= p_1 < 1$) in $0 < w < 1$
or else it is strictly increasing with w and in either case $f'(w) < 1$. The
mean value theorem gives $f(w) > w$ in $0 < w < 1$, hence p = 1.
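Corollaries 2 and 3 can be illustrated numerically. Since $f_n(0) = P(X_n = 0)$ increases with n, iterating f from 0 approaches the smallest root of f(w) = w from below. The sketch below uses this plain iteration, with a finite list `p` of offspring probabilities as an assumed input; the function name and tolerances are illustrative.

```python
def extinction_probability(p, tol=1e-14, max_iter=100_000):
    """Approximate the smallest root of f(w) = w in [0, 1] by iterating f from 0.

    p[v] stands for p_v. Convergence is slow when f'(1) is close to 1.
    """
    def f(w):
        return sum(c * w**v for v, c in enumerate(p))

    w = 0.0
    for _ in range(max_iter):
        nxt = f(w)
        if abs(nxt - w) < tol:
            return nxt
        w = nxt
    return w

if __name__ == "__main__":
    print(extinction_probability([0.25, 0.25, 0.5]))  # here f'(1) = 1.25 > 1
    print(extinction_probability([0.5, 0.3, 0.2]))    # here f'(1) = 0.7 <= 1
```

In the first case the roots of f(w) = w are 1/2 and 1, and the iteration settles at 1/2; in the second case the only root in the interval is 1, in agreement with the criterion for extinction.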

Putting $a = \mathcal{P}(\alpha)$ we have the following lemma:

LEMMA 4. $a \le \rho$.

PROOF. We already know that $\mathcal{P}(z)$ has a unique analytic inverse given by
(24) for $|\mathcal{P}(z)| < \rho$, but on the other hand $\mathcal{P}'(z) \ne 0$ for $0 < z < \alpha$ so this
inverse is analytic for $0 < w < a$. If we had $a > \rho$ we could continue f(w)
analytically by means of (24) along the real axis past its singularity at $\rho$, but
this is impossible.

COROLLARY. p = 1 if and only if $a \ge 1$.

PROOF. The necessity follows from the monotone behavior of $\mathcal{P}(z)$ for
$0 < z < \alpha$. Conversely, if $a \ge 1$ then $z = \mathcal{P}^{-1}(1) = 1$.

THEOREM 3. If $p_0 + p_1 \ne 1$, then


(25)  $\alpha$ and a are finite;
(26)  $f(a) = a/\alpha$;
(27)  $f'(a) \le 1/\alpha$ where the strict inequality can hold only if $a = \rho$.


PROOF. Let $r \ge 2$ be such that $p_r \ne 0$, then for $0 < z < \alpha$, we get from the
functional equation


$z p_r (\mathcal{P}(z))^r - \mathcal{P}(z) < 0$;


$0 < \mathcal{P}(z) < \left(\frac{1}{z p_r}\right)^{1/(r-1)}$.


By letting $z \to \alpha-$ we see a is finite and $\mathcal{P}(z)$ is bounded. Since $\mathcal{P}(z)$ is monotonic
in this region we get $\alpha < \infty$. By letting $z \to \alpha$ in $\mathcal{G}(z, \mathcal{P}(z))$ we get (26).
For $0 < z < \alpha$, $\mathcal{G}_w(z, \mathcal{P}(z)) = z f'(\mathcal{P}(z)) - 1$ is continuous and monotonic increasing
with z and is $< 0$ for z near 0. From the general theorem on implicit
functions we know $\mathcal{G}_w(z, \mathcal{P}(z)) \ne 0$ for $|z| < \alpha$, so if we let $z \to \alpha$ we obtain (27).

If $a = \rho$ (27) merely guarantees the finiteness of $f'(\rho)$ and gives an upper bound.
One can easily construct an example where $1/\alpha$ is the least upper bound and
one where it is not.

But if $a < \rho$ then since $\mathcal{G}(z, w)$ is analytic at $(\alpha, a)$ and $\mathcal{G}(\alpha, a) = 0$ we obtain
from the implicit function theorem the strict equality in (27).

COROLLARY. If $\alpha = 1$ then a = p = 1.

PROOF. By (26)


(28)  $f(a) = a = \mathcal{P}(1) = p \le 1$.


If $a < \rho$ then $f'(p) = 1$ so p = 1 from the convexity of f(w). If $a = \rho$ then
$a \ge 1$ which when combined with (28) gives a = 1.


The case where $p_0 + p_1 = 1$ escapes Theorem 3 but it is easily examined
separately, namely


$f(w) = p_0 + p_1 w$,  $p_0 \ne 0$,


$\mathcal{P}(z) = \sum_{n=1}^{\infty} p_0 p_1^{n-1} z^n = \frac{p_0 z}{1 - p_1 z}$.


Hence p = 1, $\alpha = 1/p_1$ and $a = \rho = \infty$.

For the practical applications of the theory it is valuable to know some
conditions which guarantee $a < \rho$, and thus strict equality in (27). From the
foregoing analysis it is evident that one such condition is $\rho = \infty$, i.e. f(w) is an
entire function, and another is $f'(1) > 1$. If one has enough information about
f(w) to plot its graph for real positive w then the line through the origin tangent
to f(w) in the first quadrant touches the curve at the point $(a, a/\alpha)$ from which
we determine both a and $\alpha$.
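The graphical recipe can also be carried out numerically when $a < \rho$: by (26) and the equality in (27), a is the root of $f(w) - w f'(w) = 0$, and then $\alpha = a/f(a)$. The sketch below finds the root by bisection; the function name, the bracketing interval and the two sample generating functions are assumptions chosen for illustration.

```python
import math

def tangent_point(f, fprime, lo, hi, iters=200):
    """Return (a, alpha), where a solves f(a) - a*f'(a) = 0 in (lo, hi)."""
    def h(w):
        return f(w) - w * fprime(w)

    if not (h(lo) > 0 > h(hi)):
        raise ValueError("the interval must bracket the tangent point")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    return a, a / f(a)

if __name__ == "__main__":
    p0, p1, p2 = 0.25, 0.25, 0.5
    print(tangent_point(lambda w: p0 + p1 * w + p2 * w * w,
                        lambda w: p1 + 2 * p2 * w, 0.0, 10.0))
    lam = 2.0
    print(tangent_point(lambda w: math.exp(lam * (w - 1)),
                        lambda w: lam * math.exp(lam * (w - 1)), 0.0, 10.0))
```

Bisection is used here only because $f(w) - w f'(w)$ is decreasing for positive w when f is convex, so a sign change brackets the root; any other root finder would serve equally well.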


6. Asymptotic properties of the distributions. If we examine the terms of
the sequence $p_0, p_1, \cdots$ we may find that the indices of the non-zero terms are
all multiples of some common integer larger than 1. In this case we should
expect to have $P_n = 0$ with the same sort of regularity. So let us define q to
be the largest integer such that $p_\nu \ne 0$ implies $\nu$ is a multiple of q. Clearly we
have $q \ge 1$ and q = 1 means there is no integer other than 1 which divides the
indices of all the non-zero $p_\nu$. Of course, $p_1 \ne 0$ implies q = 1. The following
theorem establishes an asymptotic estimate for $P_n$ valid for large n, provided
n - 1 is a multiple of q, and incidentally shows that $P_n = 0$, if n - 1 is not a
multiple of q.

THEOREM 4. If $a < \rho$ then


(29)  $P_n = q\left(\frac{a}{2\pi\alpha f''(a)}\right)^{1/2} \alpha^{-n} n^{-3/2} + O(\alpha^{-n} n^{-5/2})$,  $n \equiv 1 \pmod q$;
      $P_n = 0$,  $n \not\equiv 1 \pmod q$;

i.e. for large $n \equiv 1 \pmod q$


$P_n \cong q\left(\frac{a}{2\pi\alpha f''(a)}\right)^{1/2} \alpha^{-n} n^{-3/2}$.


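PROOF. Let us put $\theta = 2\pi/q$, then for $|w| \le a$,


$|f(w)| = \left|\sum_{k=0}^{\infty} p_{kq} w^{kq}\right| \le \sum_{k=0}^{\infty} p_{kq} |w|^{kq} = f(|w|)$,


and the equality evidently holds if and only if arg w is an integral multiple of $\theta$.
Furthermore, if w is such that $|f(w)| = f(|w|)$ and we put $z = \mathcal{P}^{-1}(w)$ then
$w = \mathcal{P}(w/f(w))$ so we get


$|\mathcal{P}(z)| = \left|\mathcal{P}\left(\frac{w}{f(w)}\right)\right| = \mathcal{P}\left(\frac{|w|}{f(|w|)}\right) = \mathcal{P}(|z|)$,


hence $P_n = 0$, if $n \not\equiv 1 \pmod q$.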
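For $|z| = \alpha$ and $w = \mathcal{P}(z)$ the point (z, w) satisfies (18) by (23). If we put


$z_\nu = \alpha e^{i\nu\theta}$,
$w_\nu = a e^{i\nu\theta}$,  $\nu = 0, 1, \cdots q - 1$,

then $w_\nu = \mathcal{P}(z_\nu)$ and

$\mathcal{G}_w(z_\nu, w_\nu) = z_\nu f'(w_\nu) - 1 = \alpha f'(a) - 1 = 0$

so that $z_0, z_1, \cdots z_{q-1}$ are certainly singularities of $\mathcal{P}(z)$. But f(w) is analytic
at $w_\nu$ and $f(w_\nu) = a/\alpha \ne 0$, so the solution of (18) for z,

$z = \mathcal{P}^{-1}(w) = \frac{w}{f(w)}$,

is analytic at $w_\nu$. Furthermore

$\frac{d}{dw}\mathcal{P}^{-1}(w_\nu) = \frac{1 - z_\nu f'(w_\nu)}{f(w_\nu)} = 0$,

$\frac{d^2}{dw^2}\mathcal{P}^{-1}(w_\nu) = -\frac{z_\nu f''(w_\nu)}{f(w_\nu)} = -\frac{\alpha^2 f''(a)}{w_\nu} \ne 0$,

which shows that $\mathcal{P}(z)$ has a branch point of order 1 at each $z_\nu$, i.e. $\mathcal{P}(z)$ is an
analytic function of $(z - z_\nu)^{1/2}$ in the neighborhood of $(z_\nu, w_\nu)$, $\nu = 0, 1, \cdots q - 1$.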
For $|z| = \alpha$, $w = \mathcal{P}(z)$ but $z \ne z_\nu$ we obtain


$|\mathcal{G}_w(z, w)| \ge 1 - \alpha|f'(w)| \ge 1 - \alpha f'(|w|) > 1 - \alpha f'(a) = 0$,


hence $\mathcal{P}(z)$ is an analytic function of z in a certain neighborhood of such a
pair (z, w).

By analytic continuation we find a circle of radius $\beta > \alpha$ such that $\mathcal{P}(z)$ is an
analytic function of $(z - z_\nu)^{1/2}$ for $|z| < \beta$. If we make radial cuts in this
circle running outward from each $z_\nu$ then in the resulting domain D each of the
functions $(z - z_\nu)^{1/2}$ is an analytic function of z hence so is $\mathcal{P}(z)$.

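Let $\Gamma$ be the path consisting of the boundary of D oriented in the positive
sense, let $\gamma$ be the part of $\Gamma$ lying in the sector $-\pi/q < \arg z < \pi/q$,
and let $\gamma'$ be that part of $\gamma$ leading from $\beta$ to $\alpha$ along the lower lip of the cut at
$\alpha$, thence along the upper lip back to $\beta$. Since $\mathcal{P}(z)$ satisfies the relation
$\mathcal{P}(e^{i\nu\theta} z) = e^{i\nu\theta}\mathcal{P}(z)$ for $\nu = 0, 1, \cdots q - 1$, we see from Cauchy's formula that

$P_n = \frac{1}{2\pi i}\int_\Gamma \frac{\mathcal{P}(z)}{z^{n+1}}\,dz = \frac{A}{2\pi i}\int_\gamma \frac{\mathcal{P}(z)}{z^{n+1}}\,dz$,

where


$A = \sum_{\nu=0}^{q-1} e^{i\nu\theta(1-n)} = 0$,  $n \not\equiv 1 \pmod q$;
$A = q$,  $n \equiv 1 \pmod q$.

Restricting ourselves to $n \equiv 1 \pmod q$ we put

$\mathcal{P}(z) = a + b(z - \alpha)^{1/2} + c(z - \alpha) + (z - \alpha)^{3/2} Q(z)$,

where Q(z) is analytic in D. Then $P_n = B + C$, where

$B = \frac{q}{2\pi i}\int_\gamma \frac{a + b(z - \alpha)^{1/2} + c(z - \alpha)}{z^{n+1}}\,dz$,

$C = \frac{q}{2\pi i}\int_\gamma \frac{(z - \alpha)^{3/2} Q(z)}{z^{n+1}}\,dz$.

We find

$B = \frac{bq}{2\pi i}\int_{\gamma'} \frac{(z - \alpha)^{1/2}}{z^{n+1}}\,dz + O(\beta^{-n}) = i b q \sqrt{\alpha}\,(-1)^n \binom{1/2}{n} \alpha^{-n} + O(\beta^{-n})$;


$C = O\left(\int_{\gamma'} \frac{|z - \alpha|^{3/2}}{|z|^{n+1}}\,|dz|\right) = O\left(\alpha^{-n}\left|\binom{3/2}{n}\right|\right)$.


The constant b is determined from the equations

$w - a = b(z - \alpha)^{1/2} + \cdots$,

$z - \alpha = -\frac{\alpha^2 f''(a)}{2a}(w - a)^2 + \cdots$.


Using the fact that


$\left|\binom{1/2}{n}\right| = (4\pi n^3)^{-1/2} + O(n^{-5/2})$,


$\left|\binom{3/2}{n}\right| = O(n^{-5/2})$,


we finally obtain (29) as desired.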










Thus $P_n$ approaches zero a little faster than exponentially with n regardless
of whether p = 1 or p < 1, except for the special case when $\alpha = 1$. In this
case it is interesting that, according to the corollary to lemma 4, p = 1.

The case where $q \ne 1$ is of no practical importance since one can always bring
q back to 1 by making a very small decrease in one of the non-zero $p_\nu$ and increasing
$p_1$ by the same amount. This can clearly be done so that none of the
important characteristics of f(w) is changed appreciably.
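The quality of the estimate (29) can be examined on a case where $P_n$ is known in closed form. For $f(w) = e^{\lambda(w-1)}$ formula (21) gives $P_n = e^{-\lambda n}(\lambda n)^{n-1}/n!$, and the tangent conditions (26), (27) give $a = 1/\lambda$. The sketch below compares the logarithms of the exact value and of the leading term of (29) with q = 1; the function names are illustrative, and logarithms are used only to avoid floating-point underflow.

```python
import math

def exact_log_prob(lam, n):
    """log P_n for f(w) = exp(lam*(w - 1)), i.e. log of exp(-lam*n) (lam*n)^(n-1) / n!."""
    return -lam * n + (n - 1) * math.log(lam * n) - math.lgamma(n + 1)

def asymptotic_log_prob(lam, n):
    """log of the leading term of (29) with q = 1 for the same f."""
    a = 1.0 / lam                    # root of a f'(a) = f(a)
    f_a = math.exp(lam * (a - 1.0))
    alpha = a / f_a                  # from (26)
    f2_a = lam * lam * f_a           # f''(a)
    const = math.sqrt(a / (2.0 * math.pi * alpha * f2_a))
    return math.log(const) - n * math.log(alpha) - 1.5 * math.log(n)

if __name__ == "__main__":
    for lam in (0.5, 1.0, 2.0):
        for n in (10, 100, 1000):
            ratio = math.exp(exact_log_prob(lam, n) - asymptotic_log_prob(lam, n))
            print(lam, n, ratio)
```

The printed ratios approach 1 as n grows, for values of $\lambda$ below, at and above 1, which matches the remark that the form of the decay does not depend on whether p = 1 or p < 1.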


7. The limiting distributions of W(t) and $n^{-1}Y_k(t)$ for $t \in \mathcal{F}_n$. Let us momentarily
drop the condition $p_0 \ne 0$. The characteristic function of W is


(30)  $\int_{\mathcal{T}} e^{i\theta W}\,dP = f(e^{i\theta})$,


so that for the rth moment of W we have


(31)  $E(W^r) = \left[\frac{1}{i^r}\frac{d^r}{d\theta^r} f(e^{i\theta})\right]_{\theta=0}$,  $r = 0, 1, \cdots$.

For the first and second moments we obtain

$E(W) = f'(1)$,

$E(W^2) = f'(1) + f''(1)$,

which shows that the criterion for extinction (Corollary 3 to Theorem 2) may be
stated as follows: the multiplicative process is almost certain to expire if and
only if $E(W) \le 1$. From (30) we see that all the moments of W will be finite
as soon as $\rho > 1$; but if $\rho = 1$ no general statement can be made, except in case
$\alpha = 1$ also, for indeed $\alpha = 1$ implies a = 1 so by (31) and (27) $E(W) = f'(1) \le 1$.
We now reassume $p_0 \ne 0$. Since the variables $Z, Y_0, Y_1, \cdots$ are restricted to
$t \in \mathcal{F}$ it is convenient to see what happens to W in $\mathcal{F}$. If we define $g(w) = p^{-1} f(pw)$
then (13) shows g(w) and $g(e^{i\theta})$ are the generating function and characteristic
function respectively for W, given $t \in \mathcal{F}$. Thus we see immediately that the
first moment of W, given $\mathcal{F}$, is always $\le 1$, and all its moments are finite if p < 1.
In case 0 < p < 1 we may also introduce h(w) defined by


(32)  $f(w) = p\,g(w) + (1 - p)\,h(w)$,


then h(w) is obviously the generating function of W, given $\mathcal{I}$. Here the rth
moment is finite whenever the rth moment of W is finite. (32) gives


$P(W = k / \mathcal{I}) = p_k \frac{1 - p^k}{1 - p}$,  $k = 1, 2, \cdots$.
It would be interesting to be able to compare this with the corresponding thing
for large finite trees and in this connection we have the following theorem:

THEOREM 5. If $a < \rho$ and q = 1,


$\lim_{n \to \infty} P(W = k / \mathcal{F}_n) = \alpha k p_k a^{k-1}$,  $k = 1, 2, \cdots$.
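PROOF. By expanding $z f(e^{i\theta}\mathcal{P}(z))$ in powers of z we obtain


(33)  $z f(e^{i\theta}\mathcal{P}(z)) = \sum_{n=1}^{\infty} \phi_n(\theta) z^n$,

where

$\phi_n(\theta) = \int_{\mathcal{F}_n} e^{i\theta W}\,dP = \sum_{\nu=1}^{\infty} \sum_{\sum n_j = n-1} e^{i\nu\theta} p_\nu P_{n_1} P_{n_2} \cdots P_{n_\nu}$,

so that if $P_n \ne 0$ then $P_n^{-1}\phi_n(\theta)$ is the characteristic function of W, given $\mathcal{F}_n$.
From (33) we get

$\phi_n(\theta) = \frac{1}{2\pi i}\int_\Gamma \frac{f(e^{i\theta}\mathcal{P}(z))}{z^n}\,dz$.

Since $a < \rho$ we may expand $f(e^{i\theta}\mathcal{P}(z))$ about the point $\mathcal{P}(z) = a$ and integrate
as in the proof of theorem 4, thus


$P_n^{-1}\phi_n(\theta) = \frac{P_{n-1}}{P_n}\, e^{i\theta} f'(a e^{i\theta}) + \epsilon_n(\theta)$.


Since $\epsilon_n(\theta) \to 0$ as $n \to \infty$,

$\lim_{n \to \infty} P_n^{-1}\phi_n(\theta) = \alpha e^{i\theta} f'(a e^{i\theta})$,

the limit function obviously being the characteristic function for the distribution
whose generating function is $\alpha w f'(aw)$, from which the theorem follows directly.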
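Theorem 5 can be checked numerically for $f(w) = p_0 + p_1 w + p_2 w^2$, for which the tangent conditions (26) and (27) give $a = \sqrt{p_0/p_2}$ and $\alpha^{-1} = p_1 + 2\sqrt{p_0 p_2}$. The sketch below computes $P(W = k / \mathcal{F}_n)$ exactly from the recursion for $P_n$ and compares it with $\alpha k p_k a^{k-1}$; all function names are illustrative assumptions.

```python
import math

def size_probs(p0, p1, p2, n_max):
    """P[n] = P_n for f(w) = p0 + p1*w + p2*w^2, from P_n = p1 P_{n-1} + p2 sum_j P_j P_{n-1-j}."""
    P = [0.0] * (n_max + 1)
    P[1] = p0
    for n in range(2, n_max + 1):
        conv = sum(P[j] * P[n - 1 - j] for j in range(1, n - 1))
        P[n] = p1 * P[n - 1] + p2 * conv
    return P

def root_type_given_size(p0, p1, p2, n):
    """Exact P(W = k | Z = n) for k = 1, 2 and n >= 2."""
    P = size_probs(p0, p1, p2, n)
    conv = sum(P[j] * P[n - 1 - j] for j in range(1, n - 1))
    return {1: p1 * P[n - 1] / P[n], 2: p2 * conv / P[n]}

def limiting_root_type(p0, p1, p2):
    """alpha * k * p_k * a^(k-1) for k = 1, 2."""
    a = math.sqrt(p0 / p2)
    alpha = 1.0 / (p1 + 2.0 * math.sqrt(p0 * p2))
    return {1: alpha * p1, 2: alpha * 2.0 * p2 * a}

if __name__ == "__main__":
    print(root_type_given_size(0.3, 0.4, 0.3, 400))
    print(limiting_root_type(0.3, 0.4, 0.3))
```

Note that the limiting weights sum to $\alpha f'(a) = 1$, so they do form a probability distribution on the positive integers.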
Now $\mathcal{P}(z)/p$ is the generating function for Z, given $\mathcal{F}$, and the function solves


(34)  $z g(w) - w = 0$

for $|z| < \alpha$. We find for the rth moment of Z

$E(Z^r/\mathcal{F}) = \frac{1}{p}\left[\frac{1}{i^r}\frac{d^r}{d\theta^r}\mathcal{P}(e^{i\theta})\right]_{\theta=0}$,  $r = 0, 1, \cdots$,

hence all the moments are finite as soon as $\alpha > 1$. Since by (34)


$\frac{dw}{dz} = \frac{g(w)}{1 - z g'(w)} = \frac{w}{z(1 - z g'(w))}$,

we obtain for the first moment


$E(Z/\mathcal{F}) = \frac{\mathcal{P}'(1)}{p} = \frac{1}{1 - g'(1)} = \frac{1}{1 - f'(p)}$.

In a similar way one can express any moment of Z, provided it is finite, in terms
of $f'(p)$, $f''(p)$, etc. If $\alpha = 1$ we see from the corollary to theorem 3 that even
the first moment of Z is infinite, except for the special case where $\rho = 1$ and
$f'(1) < 1$.
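The formula $E(Z/\mathcal{F}) = 1/(1 - f'(p))$ can be compared with a direct summation of $n P_n / p$. The sketch below does this for $f(w) = p_0 + p_1 w + p_2 w^2$, assuming $p_0 \ne p_2$ so that the mean is finite and using for p the smallest root of f(w) = w, which for this f is 1 when $p_0 \ge p_2$ and $p_0/p_2$ otherwise. The function names and the truncation point are illustrative assumptions.

```python
def mean_size_from_series(p0, p1, p2, n_max=800):
    """Truncated sum of n*P_n divided by the truncated sum of P_n."""
    P = [0.0] * (n_max + 1)
    P[1] = p0
    for n in range(2, n_max + 1):
        conv = sum(P[j] * P[n - 1 - j] for j in range(1, n - 1))
        P[n] = p1 * P[n - 1] + p2 * conv
    total = sum(P)  # approximates p = P(finite tree)
    return sum(n * P[n] for n in range(1, n_max + 1)) / total

def mean_size_formula(p0, p1, p2):
    """1 / (1 - f'(p)); infinite (division by zero) when p0 == p2."""
    p = 1.0 if p0 >= p2 else p0 / p2
    return 1.0 / (1.0 - (p1 + 2.0 * p2 * p))

if __name__ == "__main__":
    print(mean_size_from_series(0.5, 0.3, 0.2), mean_size_formula(0.5, 0.3, 0.2))
    print(mean_size_from_series(0.25, 0.25, 0.5), mean_size_formula(0.25, 0.25, 0.5))
```

The truncation at a few hundred terms is adequate here only because $\alpha > 1$ in both examples, so the terms decay geometrically; near the case $\alpha = 1$ the series converges too slowly for this approach.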
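The characteristic function of $Y_k$, given $\mathcal{F}$, is


$\psi_k(\theta) = \frac{1}{p}\int_{\mathcal{F}} e^{i\theta Y_k}\,dP = \frac{1}{p}\sum_{n=1}^{\infty} \psi_{kn}(\theta)$,


where by (21)


$\psi_{kn}(\theta) = \int_{\mathcal{F}_n} e^{i\theta Y_k}\,dP = \sum_{\substack{\sum \nu_j = n \\ \sum j\nu_j = n-1}} \frac{(n-1)!}{\nu_0!\,\nu_1!\cdots}\, e^{i\theta\nu_k} p_0^{\nu_0} p_1^{\nu_1} \cdots$.


Thus, if $P_n \ne 0$, $P_n^{-1}\psi_{kn}(\theta)$ is the characteristic function of $Y_k$, given $\mathcal{F}_n$. If
$p_k = 0$ then $\psi_k(\theta) = 1$. If $p_k \ne 0$ put $p_k = e^x$ then


(35)  $\frac{\partial^r}{\partial x^r}\mathcal{P}(z) = \sum_{n=1}^{\infty} \frac{1}{i^r}\psi_{kn}^{(r)}(0)\, z^n$,

hence

$\frac{1}{p}\frac{\partial^r}{\partial x^r}\mathcal{P}(1) = E(Y_k^r/\mathcal{F})$,


which shows that all moments of $Y_k$ are finite if $\alpha > 1$. Let us put $w = \mathcal{P}(z)$,
for short, then, by (18),

(36)  $\frac{\partial w}{\partial x} = \frac{z p_k w^k}{1 - z f'(w)} = z^2 p_k w^{k-1}\mathcal{P}'(z)$,


which gives for the first moment of $Y_k$,


$E(Y_k/\mathcal{F}) = \frac{1}{p}\frac{\partial \mathcal{P}(1)}{\partial x} = p_k p^{k-1} E(Z/\mathcal{F})$,


which is to be expected since $p_k p^{k-1}$ plays the same role in $\mathcal{F}$ that $p_k$ plays in $\mathcal{T}$.
We may also expect that for $t \in \mathcal{F}_n$, $n^{-1}Y_k$ should be closely related to $p_k$. This
question is settled by the following theorem: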

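THEOREM 6. If $a < \rho$ and q = 1 then for x real


$\lim_{n \to \infty} P(n^{-1}Y_k < x / \mathcal{F}_n) = 1$, if $x > \alpha p_k a^{k-1}$;
$\lim_{n \to \infty} P(n^{-1}Y_k < x / \mathcal{F}_n) = 0$, if $x < \alpha p_k a^{k-1}$.

PROOF. We intend to estimate the rth moment of $n^{-1}Y_k$ for $t \in \mathcal{F}_n$ and n
very large from (35) by means of the contour integral


(37)  $E(n^{-r}Y_k^r/\mathcal{F}_n) = \frac{1}{2\pi i\, n^r P_n}\int_\Gamma \frac{\partial^r \mathcal{P}(z)}{\partial x^r}\,\frac{dz}{z^{n+1}}$.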


So let us put 





. (2) oirts) 
w=gJg (z), Uw, = agraz’ W, r,s = 0, | , 
then by (36) w, = 2’pw*'w and by Leibnitz formula, provided k + 0, 
r— 1)! 
(38) Wr z Dk >» - be I Wy, Wee °° * Wy, W e. 
Dyy—r—1 Volz! --> vz! 


vyy2zO0 




The principal contribution to the integral in (37) will come from the term of
(38) which has the largest size for z near $\alpha$. If we put $\zeta = (z - \alpha)^{1/2}$ then w is
regular at $\zeta = 0$ and so is the constant $p_k$. Let's assume that $w_\nu$ has a pole of
order $2\nu - 1$ at $\zeta = 0$ for $\nu = 1, 2, \cdots r - 1$, which is clearly true for $\nu = 1$.


Then if s is the number of 1 , v2, --+ 1 which are = 0, the order of the pole 
of the general term of (38) at ¢ = 0 is 

k—1 

D. (2% — 1) +s +2n +1 = Ar — ») — (ks), 

i=l 
which has the maximum value 27 — 1 if and only if » = 4 = --+ 44 =0, 
vy. =r—1. Hence 
(39) w, = J pw wert PR), 


where $R(\zeta)$ is a regular function of $\zeta$ at zero. For k = 0 the formula (38) is not
correct but it is easy to see directly that (39) is correct for k = 0. If we differentiate
(39) with respect to z and put r - 1 for r we obtain


Wer = Z pw wre + FP" Ral(s), 
hence 
w, = (Z pw" Jw? + P*R(5). 
Substituting in (37) and estimating in a manner similar to that employed 


previously we obtain 


) de 











k—lyr y(r) _ 1 2 1/2 
E(n" Yi/F,) = (pa) f PC ete (2 — a) "Ri((z — a)” 
Tr 


2rin’p, Jrz*-*t 2rin’pr2™*) 


r Par (n — r)(n — r — 1) --- (n — 2r + 1) 
Fe n 


= (p,a*") + O(n”), 


and finally


$\lim_{n \to \infty} E(n^{-r}Y_k^r/\mathcal{F}_n) = (\alpha p_k a^{k-1})^r$.

The limit of the rth moment is itself the rth moment of the distribution on the
real line which has all its mass at the point $\alpha p_k a^{k-1}$. Since this distribution is
uniquely determined by its moments, a well known theorem [7] enables us to
conclude that our sequence of distributions has this distribution as limit and
this is equivalent to what is claimed by the theorem.

It is important to notice that if we put the mass $\alpha p_k a^{k-1}$ at the point k this
determines a distribution on the real line because of (26).


8. The estimation of p. If we wish to estimate p when we know $p \ne 0$, we
may obtain an estimate from the knowledge of f(w) in 0 < w < 1, using the
method of iteration. That is we choose a function G(w) such that G(p) = p
and $|G(w) - p| < |w - p|$ for 0 < w < 1. Then if for any $w_0$ in the open
interval we compute successively $w_1, w_2, \cdots$, where $w_{n+1} = G(w_n)$ for $n \ge 0$,
we are sure that $w_n$ converges exponentially to p as $n \to \infty$.

Obviously $f(w)$ itself has the properties of $G(w)$ but we achieve faster convergence
towards $p$ using Newton's method, that is if we put

\[ f_1(w) = f(w) - w, \qquad G(w) = w - \frac{f_1(w)}{f_1'(w)}. \tag{40} \]

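If for some reason we expect $p$ to be close to 1 then it is better to put

\[ f_2(w) = \frac{f(w) - w}{w - 1} \]

and use $f_2(w)$ in (40) instead of $f_1(w)$, for then we may choose $w_0 = 1$.
Let us put $f'(1) = 1 + \epsilon$, $\epsilon > 0$; then

\[ f_2(1 + h) = \frac{f(1 + h) - 1 - h}{h} \to \epsilon, \qquad h \to 0; \]

\[ f_2'(1 + h) = \lim_{k \to 0} \left[ \frac{f(1 + h + k) - 1 - h - k}{k(h + k)} - \frac{f(1 + h) - 1 - h}{kh} \right], \]

\[ f_2'(1) = \lim_{h \to 0} \frac{f(1 + 2h) - 2f(1 + h) + 1}{2h^2} = \frac{f''(1)}{2}. \]

Hence

\[ p \approx w_1 = 1 - \frac{2\epsilon}{f''(1)}. \tag{41} \]
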

This result was previously established by Kolmogoroff [7]. 
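
As a numerical illustration of (40) and (41), the following Python sketch runs the Newton iteration for a Poisson generating function $f(w) = e^{\lambda(w-1)}$ and compares the limit with the approximation (41). The function name `newton_root`, the starting point, and the value $\lambda = 1.1$ are illustrative assumptions, not part of the paper.

```python
import math

def newton_root(f, df, w0, steps=100, tol=1e-14):
    """Iterate G(w) = w - f1(w)/f1'(w), where f1(w) = f(w) - w, as in (40)."""
    w = w0
    for _ in range(steps):
        f1 = f(w) - w
        df1 = df(w) - 1.0
        if df1 == 0.0:  # derivative vanished; stop rather than divide by zero
            break
        w_next = w - f1 / df1
        if abs(w_next - w) < tol:
            return w_next
        w = w_next
    return w

lam = 1.1  # illustrative value, slightly above 1 (an assumption)

def f(w):
    return math.exp(lam * (w - 1.0))

def df(w):
    return lam * math.exp(lam * (w - 1.0))

p_newton = newton_root(f, df, w0=0.5)
eps = df(1.0) - 1.0                    # f'(1) = 1 + eps
p_approx = 1.0 - 2.0 * eps / lam**2    # (41), using f''(1) = lam**2
print(p_newton, p_approx)
```

Starting to the left of the smaller root keeps the iterates away from the trivial root $w = 1$, because $f_1$ is convex there.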
The following two simple examples display the results of the general theory. 
EXAMPLE 1. We take $f(w) = p_0 + p_1 w + p_2 w^2$ where $p_0 + p_1 + p_2 = 1$
and $p_0, p_2 > 0$. We have $\rho = \infty$. From the equations (26) and (27),

\[ f(a) = p_0 + p_1 a + p_2 a^2 = \frac{a}{\alpha}, \]

\[ f'(a) = p_1 + 2p_2 a = \frac{1}{\alpha}, \]

we obtain easily

\[ a = \sqrt{p_0 / p_2}, \qquad \alpha^{-1} = p_1 + 2\sqrt{p_0 p_2}, \]

and it is evident that $a > 1$ is equivalent to $p_0 > p_2$ is equivalent to $f'(1) =
p_1 + 2p_2 < 1$. Now

\[ G(z, w) = zp_0 + (zp_1 - 1)w + zp_2 w^2, \]

hence

\[ F(z) = \frac{1 - zp_1 - \sqrt{(1 - zp_1)^2 - 4z^2 p_0 p_2}}{2zp_2}, \tag{42} \]

the choice of the sign of the radical being determined by letting $z \to 0$.
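\[ p = \frac{1 - p_1 - \sqrt{(1 - p_1)^2 - 4p_0 p_2}}{2p_2} = \begin{cases} 1, & p_0 \ge p_2; \\ p_0 p_2^{-1}, & p_0 < p_2. \end{cases} \]

In the case $p_1 > 0$ we have $q = 1$ and then by (21)

\[ P_n = \sum_{\substack{\nu_0 + \nu_1 + \nu_2 = n \\ \nu_1 + 2\nu_2 = n - 1}} \frac{(n - 1)!}{\nu_0!\, \nu_1!\, \nu_2!}\; p_0^{\nu_0} p_1^{\nu_1} p_2^{\nu_2}, \]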


which can also be obtained by expansion of (42) according to powers of $z$. From
(29) we get

\[ P_n \sim \frac{1}{2\sqrt{\pi}}\; p_0^{1/4} p_2^{-3/4} \left( p_1 + 2\sqrt{p_0 p_2} \right)^{n + 1/2} n^{-3/2}. \]


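In the case $p_1 = 0$ we have $q = 2$ and obtain from (42) or from (29)

\[ F(z) = \sum_{\nu=1}^{\infty} (-1)^{\nu-1} \binom{1/2}{\nu} 2^{2\nu-1} p_0^{\nu} p_2^{\nu-1} z^{2\nu-1}
        = \sum_{\nu=1}^{\infty} \frac{(2\nu - 2)!}{\nu!\, (\nu - 1)!}\; p_0^{\nu} p_2^{\nu-1} z^{2\nu-1}, \]

which shows

\[ P_n = \begin{cases} 0, & n = 2\nu; \\[4pt] \dfrac{(2\nu - 2)!}{\nu!\, (\nu - 1)!}\; p_0^{\nu} p_2^{\nu-1}, & n = 2\nu - 1. \end{cases} \]

By direct use of Stirling's formula or from (29) we get

\[ P_{2\nu-1} \sim \sqrt{\frac{2}{\pi}}\; 2^{2\nu-1} p_0^{\nu} p_2^{\nu-1} (2\nu - 1)^{-3/2}. \]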


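EXAMPLE 2. We take $f(w) = e^{\lambda(w-1)}$, $\lambda > 0$, so that $W$ has a Poisson distribution.
Then $\rho = \infty$, $q = 1$, and we get from (26) and (27)

\[ f(a) = e^{\lambda(a-1)} = a/\alpha, \]
\[ f'(a) = \lambda e^{\lambda(a-1)} = 1/\alpha, \]
\[ a = 1/\lambda, \qquad \alpha = e^{\lambda-1}/\lambda. \]

Clearly we have $a > 1$ if and only if $\lambda < 1$ and in this case 1 is evidently the
only solution for $w$ of $e^{\lambda(w-1)} = w$, hence $p = 1$. On the other hand if $\lambda > 1$
then (41) gives $p \approx 1 - 2(\lambda - 1)\lambda^{-2}$. By (21) we get

\[ P_n = \frac{(n\lambda)^{n-1}}{n!}\, e^{-n\lambda}, \]

and by direct use of Stirling's formula or from (29) we get

\[ P_n \sim \frac{1}{\sqrt{2\pi}}\; \lambda^{-1} \left( \lambda e^{1-\lambda} \right)^{n} n^{-3/2}. \]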
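
The two formulas of Example 2 can be compared numerically. The sketch below evaluates the exact probabilities $P_n = (n\lambda)^{n-1} e^{-n\lambda}/n!$ and the Stirling form $(2\pi)^{-1/2} \lambda^{-1} (\lambda e^{1-\lambda})^n n^{-3/2}$ in logarithms; the function names and the choice $\lambda = 0.5$ are assumptions made only for illustration.

```python
import math

def borel_pmf(n, lam):
    """Exact P_n = (n*lam)**(n-1) * exp(-n*lam) / n!, computed via logs."""
    log_p = (n - 1) * math.log(n * lam) - n * lam - math.lgamma(n + 1)
    return math.exp(log_p)

def borel_asymptotic(n, lam):
    """Stirling form (2*pi)**-0.5 * lam**-1 * (lam*e**(1-lam))**n * n**-1.5."""
    log_p = (-0.5 * math.log(2 * math.pi) - math.log(lam)
             + n * (math.log(lam) + 1 - lam) - 1.5 * math.log(n))
    return math.exp(log_p)

lam = 0.5  # illustrative subcritical value (an assumption)
total = sum(borel_pmf(n, lam) for n in range(1, 400))
ratio = borel_pmf(200, lam) / borel_asymptotic(200, lam)
print(total, ratio)
```

For $\lambda < 1$ the probabilities should sum to $p = 1$, and the ratio of the exact to the asymptotic value should approach 1 as $n$ grows.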


REFERENCES 


[1] R. A. Fisher, The Genetical Theory of Natural Selection, Oxford, The Clarendon Press,
1930.

[2] A. J. Lotka, Théorie Analytique des Associations Biologiques 2, Hermann and Co.,
Paris, 1939.

[3] D. Hawkins and S. Ulam, Theory of Multiplicative Processes I, MDDC-287, 1944.

[4] T. E. Harris, "Some theorems on the Bernoullian multiplicative process," thesis,
doctor of philosophy, Princeton University, 1947.

[5] A. M. Yaglom, "Certain limit theorems of the theory of branching random processes,"
Doklady Akad. Nauk. SSSR (N. S.), Vol. 56 (1947), pp. 795-798.

[6] D. G. Kendall, "On the generalized 'birth-and-death' process," Annals of Math.
Stat., Vol. 19 (1948).

[7] A. Kolmogoroff, "Zur Lösung einer biologischen Aufgabe," Mitt. Forsch.-Inst. Math.
u. Mech. Univ. Tomsk, Vol. 2 (1938), pp. 1-6.

[8] L. Pontrjagin, Topological Groups, Princeton Univ. Press, 1946.

[9] J. von Neumann, Functional Operators, mimeographed notes, Institute for Advanced
Study, 1933-35.

[10] A. Kolmogoroff, Grundbegriffe der Wahrscheinlichkeitsrechnung, Chelsea Publishing
Co., New York, 1946.

[11] A. Hurwitz and R. Courant, Funktionentheorie, Springer, Berlin, 1929.

[12] M. G. Kendall, The Advanced Theory of Statistics, Vol. I, Griffin Co., London, 1948.


APPLICATION OF THE RADON-NIKODYM THEOREM TO THE 
THEORY OF SUFFICIENT STATISTICS¹


By Paul R. Halmos² and L. J. Savage
University of Chicago 


Summary. The body of this paper is written in terms of very general and 
abstract ideas which have been popular in pure mathematical work on the theory 
of probability for the last two or three decades. It seems to us that these ideas, 
so fruitful in pure mathematics, have something to contribute to mathematical 
statistics also, and this paper is an attempt to illustrate the sort of contribution 
we have in mind. The purpose of generality here is not to solve immediate 
practical problems, but rather to capture the logical essence of an important 
concept (sufficient statistic), and in particular to disentangle that concept from 
such ideas as Euclidean space, dimensionality, partial differentiation, and the 
distinction between continuous and discrete distributions, which seem to us 
extraneous. 

In accordance with these principles the center of the stage is occupied by a 
completely abstract sample space—that is a set X of objects x, to be thought 
of as possible outcomes of an experimental program, distributed according to an 
unknown one of a certain set of probability measures. Perhaps the most familiar 
concrete example in statistics is the one in which X is n dimensional Cartesian 
space, the points of which represent n independent observations of a normally 
distributed random variable with unknown parameters, and in which the 
probability measures considered are those induced by the various common 
normal distributions of the individual observations. 

A statistic is defined, as usual, to be a function T of the outcome, whose
values, however, are not necessarily real numbers but may themselves be abstract
entities. Thus, in the concrete example, the entire set of n observations, or,
less trivially, the sequence of all sample moments about the origin are statistics 
with values in an n dimensional and in an infinite dimensional space respectively. 
Another illuminating and very general example of a statistic may be obtained as 
follows. Suppose that the outcomes of two not necessarily statistically inde- 
pendent programs are thought of as one united outcome—then the outcome T 
of the first program alone is a statistic relative to the united program. A 
technical measure theoretic result, known as the Radon-Nikodym theorem, is 
important in the study of statistics such as T. It is, for example, essential 
to the very definition of the basic concept of conditional probability of a subset 
E of X given a value y of T. 

The statistic T is called sufficient for the given set $\mathfrak{M}$ of probability measures
if (somewhat loosely speaking) the conditional probability of a subset E of X
given a value y of T is the same for every probability measure in $\mathfrak{M}$. It is, for
instance, well known that the sample mean and variance together form a sufficient
statistic for the measures described in the concrete example.


1 This paper was the basis of a lecture delivered upon invitation of the Institute at the
meeting in Chicago on December 30, 1947.
2 Fellow of the John Simon Guggenheim Memorial Foundation.

The theory of sufficiency is in an especially satisfactory state for the case
in which the set $\mathfrak{M}$ of probability measures satisfies a certain condition described
by the technical term dominated. A set $\mathfrak{M}$ of probability measures is called
dominated if each measure in the set may be expressed as the indefinite integral
of a density function with respect to a fixed measure which is not itself necessarily
in the set. It is easy to verify that both classical extremes, commonly referred
to as the discrete and continuous cases, are dominated.

One possible formulation of the principal result concerning sufficiency for 
dominated sets is a direct generalization to the abstract case of the well known 
Fisher-Neyman result: T is sufficient if and only if the densities can be written as
products of two factors, the first of which depends on the outcome through T
only and the second of which is independent of the unknown measure. Another
way of phrasing this result is to say that T is sufficient if and only if the likelihood
ratio of every pair of measures in $\mathfrak{M}$ depends on the outcome through T only.
The latter formulation makes sense even in the not necessarily dominated case 
but unfortunately it is not true in that case. The situation can be patched up 
somewhat by introducing a weaker notion called pairwise sufficiency. 

In ordinary statistical parlance one often speaks of a statistic sufficient 
for some of several parameters. The abstract results mentioned above can 
undoubtedly be extended to treat this concept. 


1. Basic definitions and notations. A measurable space $(X, \mathbf{S})$ is a set $X$
and a $\sigma$-algebra $\mathbf{S}$ of subsets of $X$.³ If $(X, \mathbf{S})$ and $(Y, \mathbf{T})$ are measurable
spaces and if $T$ is a transformation from $X$ into $Y$ (or, in other words, if $T$
is a function with domain $X$ and range in $Y$), then $T$ is measurable if, for every $F$
in $\mathbf{T}$, $T^{-1}(F) \in \mathbf{S}$. If $Y$ is a Borel set in a finite dimensional Euclidean space,
then we shall always understand that $\mathbf{T}$ is the class of all Borel subsets of $Y$,
and the measurability of a function $f$ from $X$ to $Y$ will be expressed by the
notation $f \; (\in) \; \mathbf{S}$.

Throughout most of what follows it will be assumed that $(X, \mathbf{S})$ and $(Y, \mathbf{T})$
are fixed measurable spaces and that $T$ is a measurable transformation (also
called a statistic) from $X$ onto $Y$. A helpful example to keep in mind is the
Cartesian plane in the role of $X$, its horizontal coordinate axis in the role of $Y$,
and perpendicular projection from $X$ onto $Y$ in the role of $T$.

The following notations will be used. If $g$ is a point function on $Y$ (with
arbitrary range), then $gT$ is the point function on $X$ defined by $gT(x) = g(T(x))$.
If $\mu$ is a set function (with arbitrary range) on $\mathbf{S}$, then $\mu T^{-1}$ is the set function
on $\mathbf{T}$ defined by $\mu T^{-1}(F) = \mu(T^{-1}(F))$. The class of all sets of the form $T^{-1}(F)$,
with $F \in \mathbf{T}$, will be denoted by $T^{-1}(\mathbf{T})$; the characteristic function of a set $A$
(in any space) will be denoted by $\chi_A$.


3 A $\sigma$-algebra is a non empty class $\mathbf{S}$ of sets, closed under the formation of complements
and countable unions. If $(X, \mathbf{S})$ is a measurable space, the sets of $\mathbf{S}$ will be called the
measurable sets of $X$.

LEMMA 1. If $g$ is any function on $Y$ and $A$ is any set in the range of $g$, then

\[ \{x : gT(x) \in A\} = T^{-1}(\{y : g(y) \in A\}); \]

hence, in particular, $\chi_{T^{-1}(F)} = \chi_F T$ for every subset $F$ of $Y$.⁴

PROOF. The following statements are mutually equivalent: (a) $x_0 \in
\{x : gT(x) \in A\}$, (b) $g(T(x_0)) \in A$, (c) if $y_0 = T(x_0)$, then $g(y_0) \in A$, and (d)
$T(x_0) \in \{y : g(y) \in A\}$. The equivalence of the first and last ones of these
statements is exactly the assertion of the lemma.
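
Lemma 1 can be checked mechanically on a small finite example. In the sketch below the sample space, the statistic `T`, the function `g`, and the sets `A` and `F` are all hypothetical choices made only for illustration.

```python
X = range(12)                      # hypothetical finite sample space

def T(x):
    return x % 4                   # hypothetical statistic, maps X onto Y = {0, 1, 2, 3}

Y = sorted({T(x) for x in X})

def g(y):
    return y * y                   # an arbitrary function on Y

A = {0, 1}                         # an arbitrary set of values of g

def preimage(F):
    """T^{-1}(F): all points of X that T sends into F."""
    return {x for x in X if T(x) in F}

lhs = {x for x in X if g(T(x)) in A}             # {x : gT(x) in A}
rhs = preimage({y for y in Y if g(y) in A})      # T^{-1}({y : g(y) in A})

F = {1, 3}
# Special case: the indicator of T^{-1}(F) equals (indicator of F) composed with T.
indicator_identity = all((x in preimage(F)) == (T(x) in F) for x in X)
print(lhs == rhs, indicator_identity)
```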

We shall have frequent occasion to deal with functions on $X$ which are induced
by measurable functions on $Y$; the following result is a useful and direct structural
characterization of such functions.

LEMMA 2. If $f$ is a real valued function on $X$, then a necessary and sufficient
condition that there exist a measurable function $g$ on $Y$ such that $f = gT$ is that
$f \; (\in) \; T^{-1}(\mathbf{T})$; if such a function $g$ exists, then it is unique.⁵

PROOF. The necessity of the condition is clear. To prove sufficiency,
suppose that $f \; (\in) \; T^{-1}(\mathbf{T})$, $y_0 \in Y$, and write $X_0 = T^{-1}(\{y_0\})$. Suppose $x_0 \in X_0$
and write $E = \{x : f(x) = f(x_0)\}$. Since $f \; (\in) \; T^{-1}(\mathbf{T})$, there is a set $F$ in $\mathbf{T}$ such
that $E = T^{-1}(F)$. Since $x_0 \in E$, it follows that $y_0 \in F$ and therefore that

\[ X_0 = T^{-1}(\{y_0\}) \subset T^{-1}(F) = E. \]

In other words $f$ is a constant on $X_0$ and consequently the equation $g(y_0) = f(x_0)$
unambiguously defines a function $g$ on $Y$. The facts that $f = gT$ and that $g$ is
measurable are clear; the uniqueness of $g$ follows from the fact that $T$ maps
$X$ onto $Y$.
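
On a finite space in which every subset is measurable, the condition of Lemma 2 reduces to "$f$ is constant on each set $T^{-1}(\{y\})$". The following sketch, under that simplifying assumption, builds $g$ exactly as in the proof; the helper name `factor_through` and the example functions are hypothetical.

```python
def factor_through(f, T, X):
    """Return g (as a dict on T(X)) with f(x) == g[T(x)] for all x in X,
    or None when f takes two different values on some fibre of T."""
    g = {}
    for x in X:
        y = T(x)
        if y in g and g[y] != f(x):
            return None            # f is not constant on T^{-1}({y})
        g[y] = f(x)                # g(y0) = f(x0), as in the proof
    return g

X = range(12)

def T(x):
    return x % 4

g1 = factor_through(lambda x: (x % 4) ** 2 + 1, T, X)   # depends on x only through T(x)
g2 = factor_through(lambda x: x, T, X)                  # separates points of a fibre
print(g1, g2)
```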


2. Measures and their derivatives. A measure is a real valued, non negative,
finite (and therefore bounded), countably additive function on the measurable
sets of a measurable space.⁶ An integral whose domain of integration is not
indicated is always to be extended over the whole space. If the symbol
$[\mu]$, pronounced "modulo $\mu$", follows an assertion concerning the points $x$ of
$X$, it is to be understood that the set $E$ of those points for which the assertion
is not true is such that $E \in \mathbf{S}$ and $\mu(E) = 0$. Thus, for instance, if $f$
and $g$ are functions (with arbitrary range) on $X$, then $f = g \; [\mu]$ means that
$\mu(\{x : f(x) \ne g(x)\}) = 0$. Similarly, if $f$ is a real valued function on $X$, then
$f \; (\in) \; T^{-1}(\mathbf{T}) \; [\mu]$ means that there exists a real valued function $g$ on $X$ such
that $g \; (\in) \; T^{-1}(\mathbf{T})$ and $f = g \; [\mu]$.


4 The symbol $\{ - : - \}$ stands for the set of all those objects named before the colon
which satisfy the condition stated after it.

5 The notation $f \; (\in) \; T^{-1}(\mathbf{T})$ means of course that $f$ is a measurable function not only on the
measurable space $(X, \mathbf{S})$ but also on the measurable space $(X, T^{-1}(\mathbf{T}))$. The restriction to
real valued functions is inessential and is made only in order to avoid the introduction
of more notation.

6 Although most of the measures occurring in the applications of our theory are probability
measures (i.e. measures whose value for the whole space is 1), the consideration of probability
measures only is, in many of the proofs in the sequel, both unnecessary and insufficient.

If $\mu$ and $\nu$ are two measures on $\mathbf{S}$, $\nu$ is absolutely continuous with respect to $\mu$,
in symbols $\nu \ll \mu$, if $\nu(E) = 0$ for every measurable set $E$ for which $\mu(E) = 0$.
The measures $\mu$ and $\nu$ are equivalent, in symbols $\mu \equiv \nu$, if simultaneously $\mu \ll \nu$
and $\nu \ll \mu$.⁷ One of the most useful results concerning absolute continuity is the
Radon-Nikodym theorem, which may be stated as follows.⁸

A necessary and sufficient condition that $\nu \ll \mu$ is that there exist a non negative
function $f$ on $X$ such that

\[ \nu(E) = \int_E f(x)\, d\mu(x) \]

for every $E$ in $\mathbf{S}$. The function $f$ is unique in the sense that if also

\[ \nu(E) = \int_E g(x)\, d\mu(x) \]

for every $E$ in $\mathbf{S}$, then $f = g \; [\mu]$. If $\nu(E) \le \mu(E)$ for every $E$ in $\mathbf{S}$, then $0 \le f(x)
\le 1 \; [\mu]$.

It is customary and suggestive to write $f = d\nu/d\mu$. Since $d\nu/d\mu$ is determined
only to within a set for which $\mu$ vanishes, it follows that in a relation of the form

\[ \frac{d\nu}{d\mu} \; (\in) \; T^{-1}(\mathbf{T}) \; [\mu] \]

the symbol $[\mu]$ is superfluous and may be omitted.

For typographical and heuristic reasons it is convenient sometimes to write the
relation $f = d\nu/d\mu$ in the form $d\nu = f\,d\mu$; all the properties of Radon-Nikodym
derivatives which are suggested by the well known differential formalism correspond
to true theorems. Some of the ones that we shall make use of are
trivial (e.g. $d\nu_1 = f_1\,d\mu$ and $d\nu_2 = f_2\,d\mu$ imply $d(\nu_1 + \nu_2) = (f_1 + f_2)\,d\mu$), while
others are well known facts in integration theory (e.g. (i) $d\lambda = f\,d\nu$ and $d\nu = g\,d\mu$
imply $d\lambda = fg\,d\mu$, and (ii) $d\nu = f\,d\mu$ and $d\mu = g\,d\nu$ imply $fg = 1 \; [\mu]$).
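
On a finite space the Radon-Nikodym derivative is simply a pointwise ratio of point masses, which makes the differential formalism easy to test. The sketch below checks the chain rule (i) with exact rational arithmetic; the helper `rn_derivative` and the three measures are hypothetical, and the value of the derivative on points of $\mu$-measure zero is set to 0 by convention.

```python
from fractions import Fraction as Fr

def rn_derivative(nu, mu):
    """d(nu)/d(mu) on a finite space, given as dicts of point masses.
    Requires nu << mu; the value on mu-null points is arbitrary (0 here)."""
    for x in mu:
        if mu[x] == 0 and nu[x] != 0:
            raise ValueError("nu is not absolutely continuous with respect to mu")
    return {x: (nu[x] / mu[x] if mu[x] != 0 else Fr(0)) for x in mu}

mu  = {"a": Fr(1, 2), "b": Fr(1, 4), "c": Fr(1, 4), "d": Fr(0)}
nu  = {"a": Fr(1, 3), "b": Fr(2, 3), "c": Fr(0),    "d": Fr(0)}
lam = {"a": Fr(1),    "b": Fr(0),    "c": Fr(0),    "d": Fr(0)}

f = rn_derivative(lam, nu)     # d(lam) = f d(nu)
g = rn_derivative(nu, mu)      # d(nu)  = g d(mu)
fg = rn_derivative(lam, mu)    # d(lam) = fg d(mu)

# The chain rule holds modulo mu, i.e. at every point of positive mu-mass.
chain_ok = all(fg[x] == f[x] * g[x] for x in mu if mu[x] != 0)
print(chain_ok)
```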

We conclude this section with a simple but useful result concerning the 
transformations of integrals. 

LEMMA 3. If $g$ is a real valued function on $Y$ and $\mu$ is a measure on $\mathbf{S}$, then

\[ \int_F g(y)\, d\mu T^{-1}(y) = \int_{T^{-1}(F)} gT(x)\, d\mu(x) \]

for every $F$ in $\mathbf{T}$, in the sense that if either integral exists, then so does the other and
the two are equal.


7 It is clear that the relation of equivalence is reflexive, symmetric, and transitive,
and hence deserves its name.

8 For a proof of the Radon-Nikodym theorem and similar facts concerning the measure
and integration theory which we employ, see S. Saks, Theory of the Integral, Warszawa-Lwów,
1937.


PROOF. Replacing $g$ by $g\chi_F$ we see that it is sufficient to consider the case
$F = Y$. The proof for this case follows from the observation that every approximating
sum

\[ \sum_i g(y_i)\, \mu T^{-1}(F_i) \]

of $\int g\, d\mu T^{-1}$ is also an approximating sum

\[ \sum_i gT(x_i)\, \mu(E_i) \]

of $\int gT\, d\mu$, and conversely.⁹
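
For a finite sample space both integrals in Lemma 3 are finite sums, so the identity can be verified exactly. The space, the statistic, the measure, and the function `g` below are hypothetical choices for illustration only.

```python
from fractions import Fraction as Fr

X = range(6)                                  # hypothetical sample space

def T(x):
    return x // 2                             # statistic onto Y = {0, 1, 2}

mu = {x: Fr(x + 1, 21) for x in X}            # point masses summing to 1

def g(y):
    return 10 * y + 1                         # an arbitrary real function on Y

def induced(mu, T):
    """The measure mu T^{-1} on Y, as a dict of point masses."""
    out = {}
    for x, mass in mu.items():
        out[T(x)] = out.get(T(x), 0) + mass
    return out

muT = induced(mu, T)
F = {1, 2}

lhs = sum(g(y) * muT[y] for y in F)                   # integral of g over F w.r.t. mu T^{-1}
rhs = sum(g(T(x)) * mu[x] for x in X if T(x) in F)    # integral of gT over T^{-1}(F) w.r.t. mu
print(lhs, rhs)
```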


3. Conditional probabilities and expectations. LEMMA 4. If $\mu$ and $\nu$ are
measures on $\mathbf{S}$ such that $\nu \ll \mu$, then $\nu T^{-1} \ll \mu T^{-1}$.

PROOF. If $F \in \mathbf{T}$ and $0 = \mu T^{-1}(F) = \mu(T^{-1}(F))$, then

\[ 0 = \nu(T^{-1}(F)) = \nu T^{-1}(F).^{10} \]

Lemma 4 is the basis of the definition of a concept of great importance in
probability theory. If $\mu$ is a measure on $\mathbf{S}$ and $f$ is a non negative integrable
function on $X$, then the measure $\nu$ defined by $d\nu = f\,d\mu$ is absolutely continuous
with respect to $\mu$. It follows from Lemma 4 that $\nu T^{-1}$ is absolutely continuous
with respect to $\mu T^{-1}$; we write $d\nu T^{-1} = g\,d\mu T^{-1}$. The function value $g(y)$ is
known as the conditional expectation of $f$ given $y$ (or given that $T(x) = y$); we
shall denote it by $e_\mu(f \mid y)$. If $f = \chi_E$ is the characteristic function of a set $E$ in
$\mathbf{S}$, then $e_\mu(f \mid y)$ is known as the conditional probability of $E$ given $y$; we shall
denote it by $p_\mu(E \mid y)$.¹¹

The abstract nature of these definitions makes an intuitive justification of
them desirable. Observe that since $\nu T^{-1}(F) = \nu(T^{-1}(F)) = \int_{T^{-1}(F)} f(x)\, d\mu(x)$,
the defining equation of $e_\mu(f \mid y)$, written out in full detail, takes the form

\[ \int_{T^{-1}(F)} f(x)\, d\mu(x) = \int_F e_\mu(f \mid y)\, d\mu T^{-1}(y), \qquad F \in \mathbf{T}. \]

If $f = \chi_E$, then this equation becomes the defining equation of $p_\mu(E \mid y)$:

\[ \mu(E \cap T^{-1}(F)) = \int_F p_\mu(E \mid y)\, d\mu T^{-1}(y), \qquad F \in \mathbf{T}.^{12} \]


9 It is of interest to observe that either side of the equation in Lemma 3 may be obtained
from the other by the formal substitution $y = T(x)$. A special case of this lemma is the
celebrated and often misunderstood assertion that the expectation of a random variable is
equal to the first moment of its distribution function.

10 That the converse of Lemma 4 is not true is shown by the following example. Let $X$ be
the unit square, let $Y$ be the unit interval, and let $T$ be the perpendicular projection from
$X$ onto $Y$. Let $\mu$ be ordinary (Borel-Lebesgue) measure and let $\nu$ be linear measure on the
intersection of $X$ with, say, the horizontal line whose ordinate is $\tfrac{1}{2}$. Clearly $\nu$ is not
absolutely continuous with respect to $\mu$, but $\nu T^{-1} = \mu T^{-1}$.

11 Definitions in this form were first proposed by A. Kolmogoroff, Grundbegriffe der
Wahrscheinlichkeitsrechnung, Berlin, 1933. With a slight amount of additional trouble,
conditional expectation could be defined for more general functions, but only the non
negative case will occur in our applications.


The customary definition of "the conditional probability of $E$ given that $T(x) \in F$"
is $\mu(E \cap T^{-1}(F))/\mu(T^{-1}(F))$ (assuming that the denominator does not vanish).
Since $\mu(T^{-1}(F)) = \mu T^{-1}(F)$, we have

\[ \frac{\mu(E \cap T^{-1}(F))}{\mu(T^{-1}(F))} = \frac{1}{\mu T^{-1}(F)} \int_F p_\mu(E \mid y)\, d\mu T^{-1}(y). \]

It is now formally plausible that if "$F$ shrinks to a point $y$," then the left side
of the last written equation should tend to the conditional probability of $E$
given $y$ and the right side should tend to the integrand $p_\mu(E \mid y)$. The use of
the Radon-Nikodym differentiation theorem is a rigorous substitute for this
rather shaky difference quotient approach.
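
When the sample space is finite and every point of $Y$ has positive $\mu T^{-1}$-measure, the "customary" quotient with $F = \{y\}$ already gives $p_\mu(E \mid y)$, and the defining equation can be checked for every $F$. The following sketch does this with exact fractions; the space, measure, statistic, and helper names are assumptions made for illustration.

```python
from fractions import Fraction as Fr
from itertools import combinations

X = list(range(6))                            # hypothetical sample space

def T(x):
    return x % 3                              # statistic onto Y = {0, 1, 2}

mu = {x: Fr(x + 1, 21) for x in X}
Y = sorted({T(x) for x in X})

def measure(A):
    return sum(mu[x] for x in A)

def preimage(F):
    return {x for x in X if T(x) in F}

muT = {y: measure(preimage({y})) for y in Y}  # the induced measure mu T^{-1}

def cond_prob(E, y):
    """p_mu(E | y) as the quotient mu(E n T^{-1}({y})) / mu T^{-1}({y})."""
    return measure(set(E) & preimage({y})) / muT[y]

def defining_equation_holds(E):
    """Check mu(E n T^{-1}(F)) == sum over y in F of p(E|y) * muT(y), for every F."""
    for r in range(len(Y) + 1):
        for F in combinations(Y, r):
            lhs = measure(set(E) & preimage(set(F)))
            rhs = sum(cond_prob(E, y) * muT[y] for y in F)
            if lhs != rhs:
                return False
    return True

E = {0, 1, 5}
print(defining_equation_holds(E), cond_prob(E, 0))
```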

Since $p_\mu(E \mid y)$ is determined, for each $E$, only to within a set for which $\mu T^{-1}$
vanishes, it would be too optimistic to expect that, for each $y$, it behaves, regarded
as a function of $E$, like a measure. It is, however, easy to prove that
(i) $p_\mu(X \mid y) = 1 \; [\mu T^{-1}]$,
(ii) $0 \le p_\mu(E \mid y) \le 1 \; [\mu T^{-1}]$,
(iii) if $\{E_n\}$ is a disjoint sequence of measurable sets, then
$p_\mu(\bigcup_{n=1}^{\infty} E_n \mid y) = \sum_{n=1}^{\infty} p_\mu(E_n \mid y) \; [\mu T^{-1}]$.

The exceptional sets of measure zero depend in general on $E$ in (ii) and on the
particular sequence $\{E_n\}$ in (iii). It is interesting to observe that, despite the
fact that $\mu$ need not be a probability measure, $p_\mu$ turns out always to have the
normalization property (i). It is natural to ask whether or not the indeterminacy
of $p_\mu(E \mid y)$ may be resolved, for each $E$, in such a way that the resulting function
is a measure for each $y$, except possibly for a fixed set of $y$'s on which $\mu T^{-1}$
vanishes. Doob¹³ has shown that this is the case when $X$ is the real line; in the
general case such a resolution is impossible.¹⁴ Fortunately, however, conditional
probabilities are sufficiently tractable for most practical and theoretical purposes,
and the requirement that they should behave like probability measures in the
strict sense described above is almost never needed.





12 We observe that it is not sufficient to require this for $F = Y$ only, i.e. to require
$\mu(E) = \int p_\mu(E \mid y)\, d\mu T^{-1}(y)$. This special equation is satisfied by many functions which
do not deserve the name conditional probability; e.g. it is satisfied by $p_\mu(E \mid y) =$
constant $= \mu(E)/\mu T^{-1}(Y)$.

13 See J. L. Doob, “Stochastic processes with an integral-valued parameter,” Am. Math. 
Soc. Trans., Vol. 44 (1938), pp. 95-98. 

14 See Doob, loc. cit. Doob asserts the theorem in much greater generality, but his 
proof is incorrect. The error in the proof and a counterexample to the general theorem 
were communicated to us by J. Dieudonné in a letter dated September 4, 1947. Doob’s 
proof is valid for more general spaces than the real line (e.g. for finite dimensional Euclidean 
spaces and for compact metric spaces). The details of Dieudonné’s counterexample will 
appear in a forthcoming book (entitled Measure theory) by Halmos. 




We conclude this section with two easy but useful results which might also 
serve as illustrations of the method of finding conditional probabilities and 
expectations in certain special cases. 

LEMMA 5. If $\mu$ is a measure on $\mathbf{S}$, if $g$ is a non negative function on $Y$, integrable
with respect to $\mu T^{-1}$, and if $\nu$ is the measure on $\mathbf{S}$ defined by $d\nu = gT\,d\mu$, then
$d\nu T^{-1} = g\,d\mu T^{-1}$, or, equivalently, $e_\mu(gT \mid y) = g(y) \; [\mu T^{-1}]$.

PROOF. From $\nu(E) = \int_E gT(x)\, d\mu(x)$ and Lemma 3 it follows that

\[ \nu T^{-1}(F) = \nu(T^{-1}(F)) = \int_F g(y)\, d\mu T^{-1}(y). \]


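LEMMA 6. If $\mu$ is a measure on $\mathbf{S}$, if $f$ and $g$ are non negative functions on $X$ and
$Y$ respectively, and if $f$, $gT$, and $f \cdot gT$ are all integrable with respect to $\mu$, then

\[ e_\mu(f \cdot gT \mid y) = e_\mu(f \mid y) \cdot g(y) \; [\mu T^{-1}]. \]

Hence, in particular, if $F \in \mathbf{T}$, then

\[ p_\mu(E \cap T^{-1}(F) \mid y) = p_\mu(E \mid y)\, \chi_F(y) \; [\mu T^{-1}] \]

for every $E$ in $\mathbf{S}$.

PROOF. If $d\nu = f\,d\mu$, then, by definition of $e_\mu$, $\nu T^{-1}(F) = \int_F e_\mu(f \mid y)\, d\mu T^{-1}(y)$.
Applications of Lemmas 3 and 5 yield

\[ \int_F e_\mu(f \mid y)\, g(y)\, d\mu T^{-1}(y) = \int_F g(y)\, d\nu T^{-1}(y) = \int_{T^{-1}(F)} gT(x)\, d\nu(x) \]

\[ = \int_{T^{-1}(F)} f(x)\, gT(x)\, d\mu(x) = \int_F e_\mu(f \cdot gT \mid y)\, d\mu T^{-1}(y), \]

and therefore the desired conclusion follows from the uniqueness assertion of the
Radon-Nikodym theorem.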


4. Dominated sets of measures. In many statistical situations it is necessary 
to consider simultaneously several measures on the same o-algebra. The 
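concept of absolute continuity is easily extended to sets of measures. If $\mathfrak{M}$
and $\mathfrak{N}$ are two sets of measures on $\mathbf{S}$ and if, for every set $E$ in $\mathbf{S}$, the vanishing of
$\mu(E)$ for every $\mu$ in $\mathfrak{M}$ implies the vanishing of $\nu(E)$ for every $\nu$ in $\mathfrak{N}$, then we
shall call $\mathfrak{N}$ absolutely continuous with respect to $\mathfrak{M}$ and write $\mathfrak{N} \ll \mathfrak{M}$. If
$\mathfrak{N} \ll \mathfrak{M}$ and $\mathfrak{M} \ll \mathfrak{N}$, the sets $\mathfrak{M}$ and $\mathfrak{N}$ are called equivalent and we write $\mathfrak{M} \equiv \mathfrak{N}$.
If, in particular, $\mathfrak{M}$ contains exactly one measure $\mu$, $\mathfrak{M} = \{\mu\}$, the abbreviated
notations $\mathfrak{N} \ll \mu$, $\mu \ll \mathfrak{N}$, and $\mu \equiv \mathfrak{N}$ will be employed for $\mathfrak{N} \ll \mathfrak{M}$, $\mathfrak{M} \ll \mathfrak{N}$, and
$\mathfrak{M} \equiv \mathfrak{N}$, respectively.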

A set $\mathfrak{M}$ of measures on $\mathbf{S}$ will be called dominated if there exists a measure $\lambda$
on $\mathbf{S}$ (not necessarily in $\mathfrak{M}$) such that $\mathfrak{M} \ll \lambda$. In applications there frequently
occur sets of measures which are dominated in a sense apparently weaker than
the one just defined, weaker in that the measure $\lambda$, which may for instance be
Lebesgue measure on the Borel sets of a finite dimensional Euclidean space,
is not necessarily finite. It is easy to see, however, that whenever $\lambda$ has the
property (possessed by Lebesgue measure) that the space $X$ is the union of
countably many sets of finite measure, then a finite measure equivalent to $\lambda$
exists and the two possible definitions of domination coincide.

The following result on dominated sets of measures may be found to have 
some interest of its own and will be applied in the sequel. 

LEMMA 7. Every dominated set of measures has an equivalent countable subset.

PROOF. Let $\mathfrak{M}$ be a dominated set of measures on $\mathbf{S}$, $\mathfrak{M} \ll \lambda$; for any $\mu$ in $\mathfrak{M}$
write $f_\mu = d\mu/d\lambda$ and $K_\mu = \{x : f_\mu(x) > 0\}$. We define (for the purposes of this
proof only) a kernel as a set $K$ in $\mathbf{S}$ such that, for some measure $\mu$ in $\mathfrak{M}$, $K \subset K_\mu$
and $\mu(K) > 0$; we define a chain as a disjoint union of kernels. Since $\lambda(K) > 0$
for every kernel $K$, it follows from the finiteness of $\lambda$ that every chain is a countable
disjoint union of kernels. It follows also from these definitions that if $C$ is a
measurable subset of a chain, such that $\mu(C) > 0$ for at least one measure $\mu$ in $\mathfrak{M}$,
then $C$ is a chain, and that a disjoint union of chains is a chain. The last two
remarks imply, through the usual process of disjointing any countable union,
that a countable (but not necessarily disjoint) union of chains is a chain.

Let $\{C_j\}$ be a sequence of chains such that, as $j \to \infty$, $\lambda(C_j)$ approaches
the supremum of the values of $\lambda$ on chains. If $C = \bigcup_{j=1}^{\infty} C_j$, then $C$ is a chain
for which $\lambda(C)$ is maximal. The definition of a chain yields the existence of a
sequence $\{K_i\}$ of kernels such that $C = \bigcup_{i=1}^{\infty} K_i$, and the definition of a kernel
yields the existence, for each $i = 1, 2, \cdots$, of a measure $\mu_i$ in $\mathfrak{M}$ such that
$K_i \subset K_{\mu_i}$ and $\mu_i(K_i) > 0$. We write $\mathfrak{N} = \{\mu_1, \mu_2, \cdots\}$; since $\mathfrak{N} \subset \mathfrak{M}$, the
relation $\mathfrak{N} \ll \mathfrak{M}$ is trivial. We shall prove that $\mathfrak{M} \ll \mathfrak{N}$.

Suppose that $E \in \mathbf{S}$, $\mu_i(E) = 0$ for $i = 1, 2, \cdots$, and let $\mu$ be any measure in
$\mathfrak{M}$. It is to be proved that $\mu(E) = 0$. Since $\mu(E - K_\mu) = 0$, there is no loss of
generality in assuming that $E \subset K_\mu$. If $\mu(E - C) > 0$, then $\lambda(E - C) > 0$
and therefore (since $E - C$ is a kernel) $E \cup C$ is a chain with $\lambda(E \cup C) > \lambda(C)$.
Since this is impossible, it follows that $\mu(E - C) = 0$. Since

\[ 0 = \mu_i(E) \ge \mu_i(E \cap K_i) = \int_{E \cap K_i} f_{\mu_i}\, d\lambda \]

and since $K_i \subset K_{\mu_i}$, it follows that $\lambda(E \cap K_i) = 0$.
We conclude that $\lambda(E \cap C) = \sum_{i=1}^{\infty} \lambda(E \cap K_i) = 0$ and therefore $\mu(E \cap C) = 0$.
Since $\mu(E) = \mu(E - C) + \mu(E \cap C)$, the proof of the lemma is complete.


5. Sufficient statistics for dominated sets. The statistic $T$ is sufficient for a
set $\mathfrak{M}$ of measures on $\mathbf{S}$ if, for every $E$ in $\mathbf{S}$, there exists a measurable function
$p = p(E \mid y)$ on $Y$, such that

\[ p_\mu(E \mid y) = p(E \mid y) \; [\mu T^{-1}] \]

for every $\mu$ in $\mathfrak{M}$.¹⁵ In other words, $T$ is sufficient for $\mathfrak{M}$ if there exists a conditional
probability function common to every $\mu$ in $\mathfrak{M}$, or, crudely speaking, if the
conditional distribution induced by $T$ is independent of $\mu$.

15 The original definition of sufficiency was given by R. A. Fisher, "On the mathematical
foundations of theoretical statistics," Roy. Soc. Phil. Trans., Series A, Vol. 222 (1922),
pp. 309-368.

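THEOREM 1. A necessary and sufficient condition that the statistic $T$ be sufficient
for a dominated set $\mathfrak{M}$ of measures on $\mathbf{S}$ is that there exist a measure $\lambda$ on $\mathbf{S}$ such that
$\mathfrak{M} \equiv \lambda$ and such that $d\mu/d\lambda \; (\in) \; T^{-1}(\mathbf{T})$ for every $\mu$ in $\mathfrak{M}$.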

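Proof of necessity. Let $\mathfrak{N} = \{\mu_1, \mu_2, \cdots\}$ be a countable subset equivalent
to $\mathfrak{M}$ (Lemma 7), and write $\lambda$ for the measure on $\mathbf{S}$ defined by

\[ \lambda(E) = \sum_{i=1}^{\infty} a_i \mu_i(E), \]

where $a_i = 1/(2^i \mu_i(X))$, $i = 1, 2, \cdots$. Clearly $\mathfrak{M} \equiv \lambda$.

If $p$ is a conditional probability function common to every $\mu$ in $\mathfrak{M}$, then, for
every $F$ in $\mathbf{T}$,

\[ \lambda(E \cap T^{-1}(F)) = \sum_{i=1}^{\infty} a_i \mu_i(E \cap T^{-1}(F))
   = \sum_{i=1}^{\infty} a_i \int_F p(E \mid y)\, d\mu_i T^{-1}(y) = \int_F p(E \mid y)\, d\lambda T^{-1}(y), \]

i.e. $p$ serves also as a conditional probability for $\lambda$.

Take any fixed $\mu$ in $\mathfrak{M}$, write $d\mu/d\lambda = f$, and $e_\lambda(f \mid y) = g(y)$; then $d\mu T^{-1} =
g\,d\lambda T^{-1}$, and we have, for every $E$ in $\mathbf{S}$,

\[ \int_E f(x)\, d\lambda(x) = \mu(E) = \int p(E \mid y)\, d\mu T^{-1}(y)
   = \int p(E \mid y)\, g(y)\, d\lambda T^{-1}(y) = \int e_\lambda(\chi_E \mid y)\, e_\lambda(gT \mid y)\, d\lambda T^{-1}(y) \]

\[ = \int e_\lambda(\chi_E \cdot gT \mid y)\, d\lambda T^{-1}(y) = \int \chi_E(x)\, gT(x)\, d\lambda(x) = \int_E gT(x)\, d\lambda(x). \]

The desired result, $f(x) = gT(x) \; [\lambda]$, follows from a comparison of the first and
last terms in the last written chain of equations.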

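Proof of sufficiency. We shall prove that $p_\lambda$ is a conditional probability
function common to every $\mu$ in $\mathfrak{M}$. Take any fixed $E$ in $\mathbf{S}$ and $\mu$ in $\mathfrak{M}$ and
write $d\mu/d\lambda = gT$. If the measure $\nu$ is defined by $d\nu = \chi_E\,d\mu$, then $d\nu T^{-1} =
p_\mu\,d\mu T^{-1}$, where $p_\mu = p_\mu(E \mid y)$. The hypothesis $d\mu = gT\,d\lambda$ implies that
$d\mu T^{-1} = g\,d\lambda T^{-1}$ and hence that

\[ d\nu T^{-1} = p_\mu \cdot g\, d\lambda T^{-1}. \]

On the other hand $d\nu = \chi_E\,d\mu = \chi_E \cdot gT\,d\lambda$, so that

\[ d\nu T^{-1} = e_\lambda\, d\lambda T^{-1}, \]

where $e_\lambda = e_\lambda(\chi_E \cdot gT \mid y) = p_\lambda(E \mid y)\, g(y)$. It follows from a comparison of
the two expressions for $d\nu T^{-1}$ that

\[ p_\mu(E \mid y)\, g(y) = p_\lambda(E \mid y)\, g(y) \; [\lambda T^{-1}]. \]

Since the relation $d\mu T^{-1} = g\,d\lambda T^{-1}$ clearly implies that $g(y) \ne 0 \; [\mu T^{-1}]$ (i.e.
that $\mu T^{-1}(\{y : g(y) = 0\}) = 0$), it follows, finally, that

\[ p_\mu(E \mid y) = p_\lambda(E \mid y) \; [\mu T^{-1}]. \]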


6. Special criteria for sufficiency. Theorem 1 may be recast in a form more
akin in spirit to previous investigations of the concept of sufficiency.¹⁶

COROLLARY 1. A necessary and sufficient condition that the statistic $T$ be
sufficient for a dominated set $\mathfrak{M}$ ($\ll \lambda_0$) of measures on $\mathbf{S}$ is that, for every $\mu$ in $\mathfrak{M}$,
$f_\mu = d\mu/d\lambda_0$ be factorable in the form $f_\mu = g_\mu \cdot t$, where $0 \le g_\mu \; (\in) \; T^{-1}(\mathbf{T})$, $0 \le t$, $t$
and $g_\mu \cdot t$ are integrable with respect to $\lambda_0$, and $t$ vanishes $[\lambda_0]$ on each set in $\mathbf{S}$ for
which every $\mu$ in $\mathfrak{M}$ vanishes.

In more customary statistical language the condition asserts essentially that
"each density is factorable into a function of the statistic alone and a function
independent of the parameter."
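
The familiar coin tossing case gives a concrete instance of Corollary 1. In the sketch below $X$ is the set of outcomes of $n = 3$ tosses, $\lambda_0$ is counting measure, $T$ is the number of successes, and the density $\theta^{T(x)}(1-\theta)^{n-T(x)}$ factors with $t \equiv 1$. The parameter values and helper names are illustrative assumptions; the code checks that the conditional distribution given $T$ does not depend on $\theta$.

```python
from fractions import Fraction as Fr
from itertools import product

n = 3
X = list(product((0, 1), repeat=n))     # all outcomes of n tosses
T = sum                                 # the statistic: number of successes

def density(theta):
    """Density w.r.t. counting measure; it depends on x only through T(x)."""
    return {x: theta ** T(x) * (1 - theta) ** (n - T(x)) for x in X}

def conditional_given_T(theta):
    """Conditional probability of each outcome x given the value T(x)."""
    f = density(theta)
    total = {}
    for x in X:
        total[T(x)] = total.get(T(x), 0) + f[x]
    return {x: f[x] / total[T(x)] for x in X}

c1 = conditional_given_T(Fr(1, 3))      # two arbitrary parameter values
c2 = conditional_given_T(Fr(3, 4))
print(c1 == c2, c1[(1, 0, 0)])
```

The same computation shows that the likelihood ratio of the two measures takes one value on each set $T^{-1}(\{y\})$.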

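PROOF. If $T$ is sufficient for $\mathfrak{M}$, then there exists a measure $\lambda$ with the
properties described in Theorem 1. It follows that

\[ \frac{d\mu}{d\lambda_0} = \frac{d\mu}{d\lambda} \cdot \frac{d\lambda}{d\lambda_0}, \]

and we may write $g_\mu = d\mu/d\lambda$ and $t = d\lambda/d\lambda_0$. The only assertion that is not
immediately obvious is the one concerning the vanishing of $t$. To prove it,
suppose that $\mu(E) = 0$ for every $\mu$ in $\mathfrak{M}$; the fact that then

\[ 0 = \lambda(E) = \int_E t(x)\, d\lambda_0(x) \]

implies the desired conclusion.

If, conversely, $f_\mu = g_\mu \cdot t$, then we may write $d\lambda = t\,d\lambda_0$. The relation $\mathfrak{M} \equiv \lambda$
follows from the statement concerning the vanishing of $t$, and the relation
$d\mu/d\lambda \; (\in) \; T^{-1}(\mathbf{T})$ is implied by the equation $d\mu = g_\mu \cdot t\,d\lambda_0 = g_\mu\,d\lambda$.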

For the statement of the next consequence of Theorem 1 it is convenient to
call a set $\mathfrak{M}$ of measures on $\mathbf{S}$ homogeneous if $\mu \equiv \nu$ for every $\mu$ and $\nu$ in $\mathfrak{M}$.

COROLLARY 2. A necessary and sufficient condition that the statistic $T$ be
sufficient for a homogeneous set $\mathfrak{M}$ of measures on $\mathbf{S}$ is that, for every $\mu$ and $\nu$ in $\mathfrak{M}$,
$d\nu/d\mu \; (\in) \; T^{-1}(\mathbf{T})$.

PROOF. Since a homogeneous set is dominated (by any one of its elements),
Theorem 1 is applicable. If $T$ is sufficient for $\mathfrak{M}$ and if $\lambda$ has the properties
described in Theorem 1, then $d\nu/d\mu = (d\nu/d\lambda)/(d\mu/d\lambda)$. The converse follows,
through Theorem 1, by letting $\lambda$ be any measure in $\mathfrak{M}$.

We shall say that the statistic $T$ is pairwise sufficient for a set $\mathfrak{M}$ of measures
on $\mathbf{S}$ if it is sufficient for every pair $\{\mu, \nu\}$ of measures in $\mathfrak{M}$. In other words,
$T$ is pairwise sufficient for $\mathfrak{M}$ if, for every $E$ in $\mathbf{S}$ and $\mu$ and $\nu$ in $\mathfrak{M}$, there exists a
measurable function $p_{\mu\nu}(E \mid y)$ on $Y$ such that

\[ p_\mu(E \mid y) = p_{\mu\nu}(E \mid y) \; [\mu T^{-1}] \quad \text{and} \quad p_\nu(E \mid y) = p_{\mu\nu}(E \mid y) \; [\nu T^{-1}]. \]


16 See J. Neyman, "Su un teorema concernente le cosiddette statistiche sufficienti," Inst.
Ital. Atti. Giorn., Vol. 6 (1935), pp. 320-334. In this paper Neyman is somewhat restricted
by his use of classical analytical methods, but he points out the possibility and desirability
of extending his results to a much more general domain. For a recent presentation of the
theory and further references to the literature cf. H. Cramér, Mathematical Methods of
Statistics, Princeton, 1946.


Since pairwise sufficiency is (at least apparently) weaker than sufficiency, it is 
not surprising that there is a simple criterion for it even in the case of quite 
arbitrary (not necessarily homogeneous or dominated) sets of measures. 

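COROLLARY 3. A necessary and sufficient condition that $T$ be pairwise sufficient
for a set $\mathfrak{M}$ of measures on $\mathbf{S}$ is that, for any two measures $\mu$ and $\nu$ in $\mathfrak{M}$,
$d\mu/d(\mu + \nu) \; (\in) \; T^{-1}(\mathbf{T})$.

PROOF. If $T$ is sufficient for $\mu$ and $\nu$, then there exists a measure $\lambda \equiv \mu + \nu$
such that $d\mu/d\lambda \; (\in) \; T^{-1}(\mathbf{T})$ and $d\nu/d\lambda \; (\in) \; T^{-1}(\mathbf{T})$. It follows that

\[ \frac{d\mu}{d(\mu + \nu)} = \frac{d\mu}{d\lambda} \Big/ \left( \frac{d\mu}{d\lambda} + \frac{d\nu}{d\lambda} \right). \]

The sufficiency of the condition follows immediately by applying Theorem 1
to the two-element set $\{\mu, \nu\}$.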


7. Pairwise sufficiency and likelihood ratios. It is sometimes convenient to
express the result of Corollary 3 in slightly different language. If $\lambda$ is a measure
on $\mathbf{S}$ and if $f$ and $g$ are real valued measurable functions on $X$ such that
$\lambda(\{x : f(x) = g(x) = 0\}) = 0$, we shall say that the pair $(f, g)$ is admissible $[\lambda]$.
(Intuitively an admissible pair $(f, g)$ is to be thought of as a ratio $f/g$, which,
however, may not be formed directly at the points $x$ for which $g(x) = 0$.) Two
admissible pairs $(f_1, g_1)$ and $(f_2, g_2)$ will be called equivalent $[\lambda]$, in symbols
$(f_1, g_1) \equiv (f_2, g_2) \; [\lambda]$, if there exists a real valued measurable function $t$ on $X$
such that $t(x) \ne 0 \; [\lambda]$ and such that $f_1 = t f_2$ and $g_1 = t g_2 \; [\lambda]$. It is clear that the
relation "$\equiv [\lambda]$" is indeed an equivalence; the equivalence class containing the
admissible pair $(f, g)$ will be called the ratio of $f$ and $g$ and will be denoted by
$f \mid g$. (A ratio may accordingly be described as a measurable function from $X$
to the real projective line.) For a ratio $f \mid g$ we shall write $f \mid g \; (\in) \; T^{-1}(\mathbf{T}) \; [\lambda]$
if the equivalence class $f \mid g$ contains a pair $(f_0, g_0)$ which is admissible $[\lambda]$ and
for which $f_0 \; (\in) \; T^{-1}(\mathbf{T})$ and $g_0 \; (\in) \; T^{-1}(\mathbf{T})$.

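LEMMA 8. If $\mu$, $\nu$, $\lambda_1$, and $\lambda_2$ are measures on $\mathbf{S}$ such that $\mu + \nu \ll \lambda_1$ and
$\mu + \nu \ll \lambda_2$, then the pairs $(d\mu/d\lambda_1, d\nu/d\lambda_1)$ and $(d\mu/d\lambda_2, d\nu/d\lambda_2)$ are admissible
$[\mu + \nu]$ and equivalent $[\mu + \nu]$.

PROOF. The admissibility of, for instance, $(d\mu/d\lambda_1, d\nu/d\lambda_1)$ follows from the
fact that $d\mu/d\lambda_1 \neq 0\ [\mu]$ and $d\nu/d\lambda_1 \neq 0\ [\nu]$, whence

$$(\mu + \nu)\left(\left\{x: \frac{d\mu}{d\lambda_1}(x) = \frac{d\nu}{d\lambda_1}(x) = 0\right\}\right) = 0.$$

To prove equivalence, we write $\lambda_1 + \lambda_2 = \lambda$. Since

$$\frac{d\mu}{d\lambda_1}\frac{d\lambda_1}{d\lambda} = \frac{d\mu}{d\lambda} = \frac{d\mu}{d\lambda_2}\frac{d\lambda_2}{d\lambda}, \qquad
\frac{d\nu}{d\lambda_1}\frac{d\lambda_1}{d\lambda} = \frac{d\nu}{d\lambda} = \frac{d\nu}{d\lambda_2}\frac{d\lambda_2}{d\lambda},$$

since also $d\lambda_1/d\lambda \neq 0\ [\lambda_1]$ and therefore $d\lambda_1/d\lambda \neq 0\ [\mu + \nu]$, and since, similarly,
$d\lambda_2/d\lambda \neq 0\ [\mu + \nu]$, the conditions of the definition of equivalence are satisfied
by $t = (d\lambda_2/d\lambda)/(d\lambda_1/d\lambda)$.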

If $\mu$ and $\nu$ are any two measures on $\mathbf{S}$ and if $\lambda$ is any measure on $\mathbf{S}$ such that
$\mu + \nu \ll \lambda$ (for instance if $\lambda = \mu + \nu$), then the ratio $d\mu/d\lambda \mid d\nu/d\lambda$, which
according to Lemma 8 exists $[\mu + \nu]$ and is independent of $\lambda$, will be called the
likelihood ratio of $\mu$ and $\nu$ and will be denoted by $d\mu \mid d\nu$. The result of Corollary
3 may be expressed in terms of likelihood ratios as follows.

THEOREM 2. A necessary and sufficient condition that $T$ be pairwise sufficient
for a set $\mathfrak{M}$ of measures on $\mathbf{S}$ is that, for any two measures $\mu$ and $\nu$ in $\mathfrak{M}$,
$d\mu \mid d\nu\ (\epsilon)\ T^{-1}(\mathbf{T})$.

PROOF. If $T$ is sufficient for $\mu$ and $\nu$, then, by Corollary 3, $d\mu/d(\mu + \nu)\ (\epsilon)\ T^{-1}(\mathbf{T})$,
$d\nu/d(\mu + \nu)\ (\epsilon)\ T^{-1}(\mathbf{T})$, and, by Lemma 8, $(d\mu/d(\mu + \nu), d\nu/d(\mu + \nu))$ is
an admissible pair belonging to the equivalence class $d\mu \mid d\nu$. Suppose conversely
that $f = d\mu/d(\mu + \nu)$, $g = d\nu/d(\mu + \nu)$, and let the real valued measurable
functions $t$, $f_0$, and $g_0$ be such that $t \neq 0\ [\mu + \nu]$, $f_0\ (\epsilon)\ T^{-1}(\mathbf{T})$, $g_0\ (\epsilon)\ T^{-1}(\mathbf{T})$,
$(f_0, g_0)$ is admissible $[\mu + \nu]$, and

$$f = t \cdot f_0, \qquad g = t \cdot g_0 \quad [\mu + \nu].$$

Since $f$ and $g$ are non negative, it follows that $f = |t| \cdot |f_0|$ and $g = |t| \cdot |g_0|$
$[\mu + \nu]$, i.e. that there is no loss of generality in assuming that $t$, $f_0$, and $g_0$ are
non negative. The relation $f + g = 1\ [\mu + \nu]$ implies that $t \cdot (f_0 + g_0) = 1$
$[\mu + \nu]$; the fact that $(f_0, g_0)$ is admissible $[\mu + \nu]$ then yields $t\ (\epsilon)\ T^{-1}(\mathbf{T})$. The
proof is completed by comparing this result with the expressions for $f$ and $g$ in
terms of $f_0$ and $g_0$ and applying Corollary 3.


8. Pairwise sufficiency versus sufficiency. In order to show that our results 
on pairwise sufficiency (in the preceding section and in the sequel) are not 
vacuous, we proceed now to exhibit a statistic which is, for a suitable set of 
measures, pairwise sufficient but not sufficient. 

Let $X = \{(x, i): 0 \le x \le 1,\ i = 0, 1\}$ be the union of two unit intervals and
let $Y = \{y: 0 \le y \le 1\}$ be a unit interval. In accordance with our basic
convention, measurability in both $X$ and $Y$ is to be taken in the sense of Borel.
The statistic $T$ is defined by $T(x, i) = x$.

Write $X_0 = \{(x, 0): 0 \le x \le 1\}$ and $X_1 = \{(x, 1): 0 \le x \le 1\}$. Let $\mu$ be
(linear) Lebesgue measure on the class $\mathbf{S}$ of Borel subsets of $X$, and define,
whenever $E \in \mathbf{S}$ and $0 \le \alpha \le 1$,

$$\mu_\alpha(E) = \tfrac{1}{2}\{\mu(E \cap X_0) + \chi_{E \cap X_1}(\alpha, 1)\}.$$

Let $\nu$ be (linear) Lebesgue measure on the class $\mathbf{T}$ of Borel subsets of $Y$, and
define, whenever $F \in \mathbf{T}$ and $0 \le \alpha \le 1$,

$$\nu_\alpha(F) = \tfrac{1}{2}\{\nu(F) + \chi_F(\alpha)\}.$$

Clearly $\nu_\alpha = \mu_\alpha T^{-1}$; we write $\mathfrak{M} = \{\mu_\alpha: 0 \le \alpha \le 1\}$.




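If $\delta(y, \alpha)$ is defined to be 1 or 0 according as $y = \alpha$ or $y \neq \alpha$, if $\delta'(y, \alpha) =
1 - \delta(y, \alpha)$, and if

$$p_\alpha(E \mid y) = \delta'(y, \alpha)\chi_E(y, 0) + \delta(y, \alpha)\chi_E(y, 1),$$

then a straightforward computation shows that

$$\mu_\alpha(E \cap T^{-1}(F)) = \int_F p_\alpha(E \mid y)\, d\nu_\alpha(y),$$

so that $p_\alpha(E \mid y) = p_{\mu_\alpha}(E \mid y)\ [\nu_\alpha]$.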
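The straightforward computation can be written out as follows. This is a sketch that uses only the definitions of $\mu_\alpha$, $\nu_\alpha$, and $p_\alpha$ given above, with $F$ an arbitrary Borel subset of $Y$; the first step splits $\nu_\alpha$ into its Lebesgue part and its point mass at $\alpha$.

```latex
\begin{align*}
\int_F p_\alpha(E \mid y)\, d\nu_\alpha(y)
  &= \tfrac12 \int_F p_\alpha(E \mid y)\, d\nu(y)
     + \tfrac12\, \chi_F(\alpha)\, p_\alpha(E \mid \alpha) \\
  &= \tfrac12 \int_F \chi_E(y, 0)\, d\nu(y)
     + \tfrac12\, \chi_F(\alpha)\, \chi_E(\alpha, 1) \\
  &= \tfrac12 \bigl\{ \mu(E \cap T^{-1}(F) \cap X_0)
     + \chi_{E \cap T^{-1}(F) \cap X_1}(\alpha, 1) \bigr\}
   = \mu_\alpha(E \cap T^{-1}(F)).
\end{align*}
```

The second line uses the facts that $\delta'(y, \alpha) = 1$ for all $y$ outside the single point $\alpha$, which has Lebesgue measure zero, and that $\delta(\alpha, \alpha) = 1$.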
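It is now easy to verify that $T$ is pairwise sufficient for $\mathfrak{M}$. Indeed if $\alpha$ and $\beta$
are any two different numbers in the closed unit interval, we may write

$$p(E \mid y) = \delta'(y, \alpha)\delta'(y, \beta)\chi_E(y, 0) + [\delta(y, \alpha) + \delta(y, \beta)]\chi_E(y, 1).$$

Since $\{y: p(E \mid y) \neq p_\alpha(E \mid y)\} \subset \{\beta\}$ and $\{y: p(E \mid y) \neq p_\beta(E \mid y)\} \subset \{\alpha\}$, it
follows that $p(E \mid y) = p_\alpha(E \mid y)\ [\nu_\alpha]$ and $p(E \mid y) = p_\beta(E \mid y)\ [\nu_\beta]$.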


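To prove that $T$ is not sufficient for $\mathfrak{M}$ we observe that $p_\alpha(X_1 \mid y) =
\delta(y, \alpha)\chi_{X_1}(y, 1) = \delta(y, \alpha)$ and therefore

$$p_{\mu_\alpha}(X_1 \mid y) = \delta(y, \alpha)\ [\nu_\alpha].$$

Suppose that there is a conditional probability function $p$ such that $p(E \mid y) =
p_{\mu_\alpha}(E \mid y)\ [\nu_\alpha]$. Then, in particular,

$$p(X_1 \mid y) = \delta(y, \alpha)\ [\nu_\alpha].$$

Since $\nu_\alpha(\{\alpha\}) = \tfrac{1}{2} > 0$, it follows that

$$p(X_1 \mid \alpha) = \delta(\alpha, \alpha) = 1,$$

or, changing to a more suggestive notation, that $p(X_1 \mid y) = 1$ for all $y$. We
have, however,

$$\nu_\alpha(\{y: p_\alpha(X_1 \mid y) = 0\}) = \nu_\alpha(\{y: \delta(y, \alpha) = 0\}) = \nu_\alpha(\{y: y \neq \alpha\}) = \tfrac{1}{2},$$

so that $\nu_\alpha(\{y: p_{\mu_\alpha}(X_1 \mid y) = 0\}) = \tfrac{1}{2}$. This contradiction shows the impossibility
of the existence of a conditional probability function common to every $\mu$ in $\mathfrak{M}$.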

This example shows also that, in a sense, sufficiency is more fundamental
than pairwise sufficiency. If, for instance, we imagine that it is important to a
statistician that he either estimate $\alpha$ sharply or refrain from estimating it
altogether, then he is by no means as well off with the observation of $y$ as with
that of $x$.


9. Pairwise sufficiency for dominated sets. We now proceed to show that 
for dominated sets of measures no such example as the one in the preceding 
section exists, or, in other words, that for dominated sets the concepts of pairwise 
sufficiency and sufficiency do coincide. 










LEMMA 9. If $T$ is pairwise sufficient for a set $\{\mu_0, \mu_1, \mu_2\}$ of three measures
on $\mathbf{S}$, then$^{17}$

$$\frac{d\mu_0}{d(\mu_0 + \mu_1 + \mu_2)}\ (\epsilon)\ T^{-1}(\mathbf{T}).$$

PROOF. According to Corollary 3,

$$f_1 = \frac{d\mu_0}{d(\mu_0 + \mu_1)}\ (\epsilon)\ T^{-1}(\mathbf{T}) \quad\text{and}\quad f_2 = \frac{d\mu_0}{d(\mu_0 + \mu_2)}\ (\epsilon)\ T^{-1}(\mathbf{T}).$$

Since $d\mu_0 = f_1\, d(\mu_0 + \mu_1) = f_2\, d(\mu_0 + \mu_2)$, we have $f_1\, d\mu_0 = f_1 f_2\, d(\mu_0 + \mu_2)$ and
$f_2\, d\mu_0 = f_1 f_2\, d(\mu_0 + \mu_1)$, so that

$$(f_1 + f_2 - f_1 f_2)\, d\mu_0 = f_1 f_2\, d(\mu_0 + \mu_1 + \mu_2).$$

If we write $d\mu_0 = f\, d(\mu_0 + \mu_1 + \mu_2)$, then it follows that

$$(f_1 + f_2 - f_1 f_2) f = f_1 f_2 \quad [\mu_0 + \mu_1 + \mu_2].$$

Since $0 \le f_1 \le 1$ and $0 \le f_2 \le 1$, the equation $f_1 + f_2 - f_1 f_2 = 0$ is equivalent
to $f_1 = f_2 = 0$. Since $\mu_0(\{x: f_1(x) = f_2(x) = 0\}) = 0$, it follows that $f$ may be
redefined, if necessary, to be 0 on the set $\{x: f_1(x) = f_2(x) = 0\}$ without affecting
the relation $d\mu_0 = f\, d(\mu_0 + \mu_1 + \mu_2)$; since outside this set $f = f_1 f_2/(f_1 + f_2 - f_1 f_2)$,
the proof of the lemma is complete.
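The closing formula of the proof, $f = f_1 f_2/(f_1 + f_2 - f_1 f_2)$, can be checked on a finite space, where a density is a pointwise ratio of masses. The sketch below is illustrative only: the function names and the three mass vectors are made up for the check and do not come from the paper.

```python
from fractions import Fraction

def density(num, den):
    """Pointwise d(num)/d(den) on a finite space; 0 where den has no mass."""
    return [Fraction(a, b) if b else Fraction(0) for a, b in zip(num, den)]

def lemma9_density(mu0, mu1, mu2):
    """Build d(mu0)/d(mu0 + mu1 + mu2) from the two pairwise densities f1 and f2."""
    f1 = density(mu0, [a + b for a, b in zip(mu0, mu1)])
    f2 = density(mu0, [a + b for a, b in zip(mu0, mu2)])
    out = []
    for a, b in zip(f1, f2):
        denom = a + b - a * b
        # denom vanishes only where f1 = f2 = 0; f is set to 0 there, as in the proof
        out.append(a * b / denom if denom else Fraction(0))
    return out

# Hypothetical point masses on a four-point space.
mu0 = [1, 2, 0, 3]
mu1 = [2, 0, 1, 3]
mu2 = [1, 1, 4, 0]
direct = density(mu0, [a + b + c for a, b, c in zip(mu0, mu1, mu2)])
print(lemma9_density(mu0, mu1, mu2) == direct)
```

Because $f$ is assembled from $f_1$ and $f_2$ by arithmetic alone, it inherits their measurability with respect to $T^{-1}(\mathbf{T})$, which is the point of the lemma.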

LEMMA 10. If $T$ is pairwise sufficient for a finite set $\{\mu_0, \mu_1, \cdots, \mu_k\}$ of
measures on $\mathbf{S}$, then $d\mu_0/d(\sum_{i=0}^{k} \mu_i)\ (\epsilon)\ T^{-1}(\mathbf{T})$.

PROOF. For $k = 1$ the conclusion is a restatement of the hypothesis; we
proceed by induction. Given $\mu_0, \mu_1, \cdots, \mu_{k+1}$, we write $\mu = \sum_{i=1}^{k} \mu_i$. Then
$d\mu_0/d(\mu_0 + \mu)\ (\epsilon)\ T^{-1}(\mathbf{T})$ by the induction hypothesis and $d\mu_0/d(\mu_0 + \mu_{k+1})\ (\epsilon)\ T^{-1}(\mathbf{T})$
by Corollary 3. Lemma 9 may then be applied to $\{\mu_0, \mu, \mu_{k+1}\}$ and
yields the desired conclusion.

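LEMMA 11. If $\{\mu_0, \mu_1, \mu_2, \cdots\}$ is a sequence of measures on $\mathbf{S}$ such that
$\sum_{i=0}^{\infty} \mu_i(X) < \infty$; if, for every $E$ in $\mathbf{S}$, $\mu(E) = \sum_{i=0}^{\infty} \mu_i(E)$; and if $\lambda$ is a measure
on $\mathbf{S}$ such that $\mu_i \ll \lambda$ for $i = 0, 1, 2, \cdots$, then

$$\lim_k\, d\Bigl(\sum_{i=0}^{k} \mu_i\Bigr)\Big/ d\lambda = d\mu/d\lambda \quad [\lambda].$$

PROOF. Since $0 \le d(\sum_{i=0}^{k} \mu_i)/d\lambda = \sum_{i=0}^{k} (d\mu_i/d\lambda) \le d\mu/d\lambda\ [\lambda]$, the series
$\sum_{i=0}^{\infty} (d\mu_i/d\lambda)$ does indeed converge to a measurable function $f\ [\lambda]$. Since,
for every $E$ in $\mathbf{S}$,

$$\int_E f\, d\lambda = \sum_{i=0}^{\infty} \int_E \frac{d\mu_i}{d\lambda}\, d\lambda = \sum_{i=0}^{\infty} \mu_i(E) = \mu(E),$$

we have $f = d\mu/d\lambda\ [\lambda]$, as stated.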





17 In view of Theorem 1, Lemma 9 asserts that if $T$ is pairwise sufficient for a set $\mathfrak{M}$ of
three elements, then $T$ is sufficient for $\mathfrak{M}$. Lemmas 10 and 12 extend this result to finite
and countably infinite sets $\mathfrak{M}$ respectively. Since every countable set of measures is
dominated, the final result, Theorem 3, contains all these preliminaries as special cases.




LEMMA 12. If $\{\mu_0, \mu_1, \mu_2, \cdots\}$ is a sequence of measures on $\mathbf{S}$ such that
$\sum_{i=0}^{\infty} \mu_i(X) < \infty$, and if, for every $E$ in $\mathbf{S}$, $\mu(E) = \sum_{i=0}^{\infty} \mu_i(E)$, then

$$\lim_k\, d\mu_0 \Big/ d\Bigl(\sum_{i=0}^{k} \mu_i\Bigr) = d\mu_0/d\mu \quad [\mu].$$

If, in addition, $T$ is pairwise sufficient for the sequence $\{\mu_0, \mu_1, \mu_2, \cdots\}$, then
$d\mu_0/d\mu\ (\epsilon)\ T^{-1}(\mathbf{T})$.

PROOF. We have, for $k = 0, 1, 2, \cdots$,

$$\frac{d\mu_0}{d(\sum_{i=0}^{k} \mu_i)} \cdot \frac{d(\sum_{i=0}^{k} \mu_i)}{d\mu} = \frac{d\mu_0}{d\mu} \quad [\mu].$$

If we write $\lambda = \mu$, then the hypotheses of Lemma 11 are satisfied and, consequently,
the second factor on the left side converges to 1 $[\mu]$; it follows that the
first factor converges to $d\mu_0/d\mu\ [\mu]$. The second assertion of the lemma follows
from Lemma 10.

THEOREM 3. A necessary and sufficient condition that $T$ be sufficient for a
dominated set $\mathfrak{M}$ of measures on $\mathbf{S}$ is that $T$ be pairwise sufficient for $\mathfrak{M}$.

PROOF. The necessity of the condition is obvious. To prove its sufficiency,
let $\mathfrak{N} = \{\mu_1, \mu_2, \cdots\}$ be a countable subset of $\mathfrak{M}$ which is equivalent to $\mathfrak{M}$
(Lemma 7), and let $\mu_0$ be an arbitrary measure in $\mathfrak{M}$. Since the sufficiency or
pairwise sufficiency of $T$ remains unaltered if some or all of the measures in $\mathfrak{M}$
are replaced by positive constant multiples of themselves, we may assume that
$\sum_{i=0}^{\infty} \mu_i(X) < \infty$. If we write, for every $E$ in $\mathbf{S}$, $\lambda(E) = \sum_{i=1}^{\infty} \mu_i(E)$, then the
pairwise sufficiency of $T$ and Lemma 12 imply that $d\mu_0/d(\mu_0 + \lambda)\ (\epsilon)\ T^{-1}(\mathbf{T})$.
The relation

$$\frac{d\mu_0}{d\lambda} = \frac{d\mu_0}{d(\mu_0 + \lambda)} \cdot \frac{d(\mu_0 + \lambda)}{d\lambda}
= \frac{d\mu_0}{d(\mu_0 + \lambda)} \left(\frac{d\lambda}{d(\mu_0 + \lambda)}\right)^{-1}
= \frac{d\mu_0}{d(\mu_0 + \lambda)} \left(1 - \frac{d\mu_0}{d(\mu_0 + \lambda)}\right)^{-1}$$

implies that $d\mu_0/d\lambda\ (\epsilon)\ T^{-1}(\mathbf{T})$; an application of Theorem 1 concludes the proof.

A comparison of Theorems 1 and 2 and Corollary 3 yields immediately the
following consequence of Theorem 3.

COROLLARY 4. A necessary and sufficient condition that the statistic $T$ be
sufficient for a dominated set $\mathfrak{M}$ of measures on $\mathbf{S}$ is that, for any two measures
$\mu$ and $\nu$ in $\mathfrak{M}$, $d\mu/d(\mu + \nu)\ (\epsilon)\ T^{-1}(\mathbf{T})$, or, equivalently, $d\mu \mid d\nu\ (\epsilon)\ T^{-1}(\mathbf{T})$.


10. The value of sufficient statistics in statistical methodology. We gather 
from conversations with some able and prominent mathematical statisticians 
that there is doubt and disagreement about just what a sufficient statistic is 
sufficient to do, and in particular about in what sense if any it contains “all the
information in a sample.” We therefore conclude this paper with a brief 
explanation of a point of view which, while not original with us, has not received 
due publicity. 










Suppose a statistician $\mathcal{S}$ is to be shown an observation $x$ drawn at random
from some sample space $(X, \mathbf{S})$ on which an unknown measure, $\mu$, of a set $\mathfrak{M}$ of
possible measures obtains, while for the same observation $x$ another statistician
$\mathcal{S}'$ is only to be shown the value $T(x)$ of some statistic $T$ sufficient for $\mathfrak{M}$. It is
clear that $\mathcal{S}$ is as well off as $\mathcal{S}'$; we shall argue that $\mathcal{S}'$ is also as well off as $\mathcal{S}$.

Suppose $\mathcal{S}$ has decided how to use his datum, that, in other words, he has
decided just what he will do (or, in particular, say) in the event of each possible $x$.
His program can then be described schematically by saying that he has selected
some function $f$ (of the points $x$) which, without serious loss of generality, may
be supposed to take real values. Now $\mathcal{S}$'s only real concern is for the probability
distribution of $f$ given $\mu$, i.e. for the function $\varphi$ of a real variable $c$, defined by

$$\varphi(c) = \mu(\{x: f(x) < c\}) = \mu(E(c)).$$


But $\mathcal{S}'$ can if he wishes achieve exactly the same results as $\mathcal{S}$, in the following way.
Let him, on learning the value of $T(x)$, select a real number $f'$ with the aid of a
“random machine” which produces numerical values according to the known
distribution function $\psi$, defined by

$$\psi(c) = p(E(c) \mid T(x)).$$

Then, for any $\mu$ in $\mathfrak{M}$, the probability that $\mathcal{S}'$ will select a value less than $c$ is

$$\int p(E(c) \mid y)\, d\mu T^{-1}(y) = \mu(E(c)) = \varphi(c).$$


Thus $\mathcal{S}'$ is at no disadvantage, save for the mechanical one of having to manipulate
a random machine, and he may fairly be said to have as much information
as $\mathcal{S}$.
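The random machine argument can be checked exactly in the coin-spinning setting, where the statistic is the number of heads and the conditional law of the sequence given that number is uniform over arrangements, whatever the heads ratio. The sketch below is illustrative: the function names and the particular decision function `f` are hypothetical choices, not taken from the paper.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def prob(x, p):
    """Probability of the head/tail sequence x when the heads ratio is p."""
    k = sum(x)
    return p ** k * (1 - p) ** (len(x) - k)

def direct_law(f, n, p):
    """Law of f(x) for the statistician who sees the whole sequence x."""
    law = defaultdict(Fraction)
    for x in product((0, 1), repeat=n):
        law[f(x)] += prob(x, p)
    return dict(law)

def machine_law(f, n, p):
    """Law of the random machine's output when only the number of heads is seen."""
    xs = list(product((0, 1), repeat=n))
    law = defaultdict(Fraction)
    for k in range(n + 1):
        fibre = [x for x in xs if sum(x) == k]
        p_k = sum(prob(x, p) for x in fibre)       # law of the statistic, depends on p
        for x in fibre:                             # conditional law: uniform, free of p
            law[f(x)] += p_k * Fraction(1, len(fibre))
    return dict(law)

f = lambda x: x[0] + 2 * x[-1]                      # an arbitrary decision function
for p in (Fraction(1, 2), Fraction(1, 4)):
    print(direct_law(f, 4, p) == machine_law(f, 4, p))
```

The machine never needs to know $p$: only the uniform conditional law enters its construction, which is what sufficiency of the number of heads provides.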

As a matter of fact we know of no practical situation in which $\mathcal{S}'$ would actually
go to the trouble of using a random machine. There are some situations in
which he should in principle do so, but in which practical statisticians have not,
so far as we know, thought it worth while. If, for example, an outcome consists
of a sequence of $n$ heads and tails resulting from $n$ spins of a coin the heads
ratio of which is known to be either one half or one quarter, then a sufficient
statistic is the number of heads which occur in the sequence. In basing a
decision on the outcome of this program both $\mathcal{S}$ and, to a still greater extent,
$\mathcal{S}'$ have (according to Wald's theory of minimum risk) something to gain by
recourse to a random machine. There are, on the other hand, many technical
desiderata which sufficient statistics meet exactly without recourse to random
machines. Thus, as Blackwell has shown,$^{18}$ if $\mathcal{S}$ has an unbiased estimate, $R$,
of some parameter, $\mathcal{S}'$ can find a function $R^*$, defined by $R^*(y) = e(R \mid y)$,
which is an unbiased estimate of that parameter, with variance not greater than
that of $R$. More generally, if $R$ is any estimate with finite mean square deviation
from a parameter, then it is easy to show with Blackwell's methods that $R^*$


18 D. Blackwell, “Conditional expectation and unbiased sequential estimation,” Annals
of Math. Stat., Vol. 18 (1947), pp. 105-110. 




has no larger a mean square deviation than R. Finally it is a well known fact 
that, under suitable hypotheses, if there exists a maximum likelihood estimate R 
of some parameter, then R depends only on y. 
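Blackwell's result can be illustrated exactly for coin spinning. In the sketch below, which is an illustration under assumed choices rather than the paper's own example, $R$ is the outcome of the first spin (an unbiased estimate of the heads ratio), the sufficient statistic is the number of heads, and $R^*$ is the conditional expectation of $R$ given that number. The names `conditional_expectation` and `mean_and_variance` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def prob(x, p):
    """Probability of the head/tail sequence x when the heads ratio is p."""
    k = sum(x)
    return p ** k * (1 - p) ** (len(x) - k)

def conditional_expectation(R, n):
    """R*(k) = e(R | k heads): the plain average of R over sequences with k heads."""
    table = {}
    for k in range(n + 1):
        fibre = [x for x in product((0, 1), repeat=n) if sum(x) == k]
        table[k] = sum(Fraction(R(x)) for x in fibre) / len(fibre)
    return table

def mean_and_variance(g, n, p):
    """Exact mean and variance of g(x) under heads ratio p."""
    xs = list(product((0, 1), repeat=n))
    mean = sum(prob(x, p) * g(x) for x in xs)
    var = sum(prob(x, p) * (g(x) - mean) ** 2 for x in xs)
    return mean, var

n, p = 5, Fraction(1, 4)
R = lambda x: x[0]                            # crude unbiased estimate: first spin
R_star_table = conditional_expectation(R, n)  # works out to k / n
R_star = lambda x: R_star_table[sum(x)]
print(mean_and_variance(R, n, p))
print(mean_and_variance(R_star, n, p))
```

Both estimates have mean $p$, and the variance drops from $p(1 - p)$ to $p(1 - p)/n$, since $R^*$ here is the proportion of heads.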

We think that confusion has from time to time been thrown on the subject 
by (a) the unfortunate use of the term “sufficient estimate,” (b) the undue
emphasis on the factorability of sufficient statistics, and (c) the assumption
that a sufficient statistic contains all the information in only the technical
sense of “information” as measured by variance.








ON DESIGNING SINGLE SAMPLING INSPECTION PLANS 


By Frank E. Grubbs


Ballistic Research Laboratories, Aberdeen Proving Ground, Md. 


1. Summary. In designing single sampling inspection plans, a problem is to
find the acceptance number, $c$, and the smallest sample size, $n$, such that if the
fraction defective of the material inspected is equal to an acceptable value, $p_1$, a
large percentage, say, 95% of such lots will be accepted under the sample criteria,
whereas if the fraction defective of the material inspected is objectionable and
equal to $p_2$ (where $p_1 < p_2$), then a large percentage, say, 90% of such lots will
be rejected. A solution to this problem for the case where the lot size is large
compared to the sample size is given in this paper and tables are provided for
quick determination of the sample size $n$ and acceptance number $c$.


2. Introduction. In sampling inspection of material one practice is to set an
acceptable quality level $= p_1$, say, such that the consumer desires to accept
practically all (95% or more) of lots of fraction defective $p_1$ or less (and hence
desires to reject at most about 5% of lots which are of quality
$p_1$ or better) and to set also an objectionable fraction defective $= p_2$, say, which
represents quality so poor that the consumer cannot afford to accept more than
about 10% of lots of this quality or poorer.$^1$ From the standpoint of the
producer, he should have very few rejections, 5% or less, for his submitted lots
the fractions defective of which are equal to or better (less) than $p_1$, whereas
he should be willing and also expect to suffer increasingly more rejections if his
process average percent defective departs from the acceptable quality level $p_1$
toward poor or objectionable quality. In this connection, if we are given $p_1$ an
acceptable quality level, $p_2$ an objectionable percent defective, the risk $\alpha = 5\%$
of rejecting a lot of fraction defective $p_1$, and the risk $\beta = 10\%$ of accepting a
lot of the objectionable fraction defective $p_2$, a problem of importance in single
sampling inspection is to find the smallest sample size $n$ and the acceptance
number $c$ which will approximate closely the protection stated above. Due to
the discrete nature of $n$ and $c$, it is not usually possible to find $n$ and $c$ such that
precisely the above protection is guaranteed; however, it is possible to pick that
single sampling plan which, for all practical purposes, gives the desired protection,
i.e. it is possible to select that single sampling plan which more nearly satisfies


1 When this paper was first presented for publication, the percent defectives $p_1$ and $p_2$
were labeled “Acceptable Quality Level” and “Lot Tolerance Percent Defective,” respectively.
In view of the suggestions of H. G. Romig and H. F. Dodge, strict reference to
these particular terms has been avoided in order that the percent defectives $p_1$ and $p_2$
would appear in a more generalized form. This recommendation is considered especially
desirable in view of the fact that Table I and Table II of the paper are percentage points
of the Binomial Distribution and hence are useful in problems other than that of designing
single sampling inspection plans.



the above protection requirements than any other plan. The values of $n$ and $c$
can be found simply by looking for an entry in Table I below which is close to
$p_1$ and an entry in Table II close to $p_2$ such that column heading $c$ and row
heading $n$ in Table I correspond exactly with the respective column and row
headings in Table II. For the sample sizes $n$, acceptance numbers $c$ and quality
levels $p$ covered in Tables I and II, the above procedure makes unnecessary any
computation of or any approximation to the sample size and acceptance number.
It will be noticed, however, that usually the proper choice of $c$ is clear whereas
some slight judgment may be necessary in selecting $n$.

It is remarked also that Tables I and II solve the equivalent problem of
finding $n$ and $c$ in connection with testing the hypothesis $H_0$ that the fraction
defective of the Binomial population sampled is $p_1$ or less as against an alternative
hypothesis $H_1$ which states that the fraction defective of the lot, population,
process, etc., sampled is $p_2$ or greater ($p_2 > p_1$), where $\alpha = .05$ is the maximum
risk of erroneously rejecting $H_0$ when it is true and $\beta = .10$ is the maximum risk
of erroneously accepting $H_0$ when the alternative $H_1$ is true.

The solution to the problem of finding an appropriate single sampling plan
in this paper is given by solving the infinite case, i.e. by assuming the lot to be an
infinite Binomial population. In practice lots are of finite size. However,
it is well known that Binomial probabilities (infinite universe) give excellent
practical approximations to Hypergeometric probabilities (finite lot) provided
the sample size is only a small percentage of the lot size. Hence, the reader is
warned in using the tables for sampling inspection problems that the lot size
should be at least 10 or 15 times the sample size.


3. Basis for construction of Table I and Table II. It is well known that if
$P(c, n, p)$ represents the probability of obtaining $c$ or less defectives in a random
sample of size $n$ from a Binomial Population of fraction defective $p$, then the
relation between $P(c, n, p)$ and the Incomplete Beta Function Ratio is given by

$$P(c, n, p) = I_{1-p}(n - c,\, c + 1) = \frac{1}{B(n - c,\, c + 1)} \int_0^{1-p} x^{n-c-1}(1 - x)^{c}\, dx. \tag{1}$$

Consequently, using a table of percentage points for the Incomplete Beta
Function (1), values of $p_1$ can be found for Table I such that

$$P(c, n, p_1) = .95,$$

and values of $p_2$ can be found for Table II presented at the end such that

$$P(c, n, p_2) = .10.$$
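The same percentage points can also be obtained without tables of the Incomplete Beta Function, by solving the defining equations numerically. The following is a minimal sketch that sums the binomial terms directly and bisects on $p$; this is not the method used to compute the paper's tables, and the function names `accept_prob` and `quality_level` are hypothetical. It assumes $c < n$, so that $P(c, n, p)$ decreases strictly from 1 to 0 as $p$ runs from 0 to 1.

```python
from math import comb

def accept_prob(c, n, p):
    """P(c, n, p): probability of c or fewer defectives in a binomial sample of n."""
    return sum(comb(n, m) * p ** m * (1 - p) ** (n - m) for m in range(c + 1))

def quality_level(c, n, target, tol=1e-12):
    """Solve P(c, n, p) = target for p by bisection (requires c < n)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if accept_prob(c, n, mid) > target:
            lo = mid   # acceptance still too likely: the root lies at a larger p
        else:
            hi = mid
    return (lo + hi) / 2

print(quality_level(1, 2, 0.95))    # compare with the Table I entry for n = 2, c = 1
print(quality_level(0, 10, 0.95))   # compare with the Table I entry for n = 10, c = 0
```

Calling `quality_level(c, n, 0.10)` gives the corresponding Table II value.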
Also, Table I and Table II can be computed by using percentage points of the
F-distribution (2). Upon making the transformation

$$x = \frac{2(n - c)}{2(n - c) + 2(c + 1)F}$$

in (1) above to the F-distribution, we obtain easily that

$$P(c, n, p) = \frac{1}{B(c + 1,\, n - c)}\, [2(c + 1)]^{c+1}\, [2(n - c)]^{n-c} \int_{(n-c)p/(c+1)q}^{\infty} F^{c}\, [2(n - c) + 2(c + 1)F]^{-(n+1)}\, dF, \tag{2}$$

where $q = 1 - p$.
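The change of variables leading from (1) to (2) can be spelled out as follows. This is a sketch in which $\nu_1 = 2(c + 1)$ and $\nu_2 = 2(n - c)$ are shorthand introduced here for the two degrees of freedom; they are not notation used at this point in the paper.

```latex
\begin{align*}
x &= \frac{\nu_2}{\nu_2 + \nu_1 F}, \qquad
1 - x = \frac{\nu_1 F}{\nu_2 + \nu_1 F}, \qquad
dx = -\frac{\nu_1 \nu_2}{(\nu_2 + \nu_1 F)^2}\, dF, \\
x^{n-c-1}(1 - x)^{c}\, |dx|
  &= \nu_1^{\,c+1}\, \nu_2^{\,n-c}\, F^{c}\, (\nu_2 + \nu_1 F)^{-(n+1)}\, dF, \\
x = 1 - p \;&\Longleftrightarrow\; F = \frac{(n - c)\,p}{(c + 1)(1 - p)}, \qquad
x \to 0 \;\Longleftrightarrow\; F \to \infty .
\end{align*}
```

The exponent $-(n + 1)$ arises from collecting $(n - c - 1) + c + 2$ powers of $\nu_2 + \nu_1 F$ in the denominator.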
With the aid of a table of percentage points of the F-distribution (2), we may
determine for various combinations of $n - c$ and $c + 1$ those values of $p$ such that

$$P(c, n, p_1) = .95 \quad\text{for Table I;}$$

and

$$P(c, n, p_2) = .10 \quad\text{for Table II.}$$

In fact, if $P(c, n, p) = \alpha$, then

$$\frac{(n - c)\,p}{(c + 1)\,q} = F_\alpha\{2(c + 1),\, 2(n - c)\}$$

or

$$p = \frac{(c + 1)\,F_\alpha\{2(c + 1),\, 2(n - c)\}}{(n - c) + (c + 1)\,F_\alpha\{2(c + 1),\, 2(n - c)\}},$$

for which relation values of $p_1$ for $\alpha = .95$ are given in Table I below and values of
$p_2$ for $\alpha = .10$ are given in Table II.

Although the 95% points are not given directly in (2), they are easily obtainable
from the relation

$$F_{.95}(\nu_1, \nu_2) = \frac{1}{F_{.05}(\nu_2, \nu_1)}.$$

Interpolation was required for the great majority of the entries in Tables I
and II. The values given were obtained by harmonic or linear interpolation
using References [1] and [2] and are believed accurate to within one unit in the
last place.

It will be noticed that if the chosen acceptable quality level, $p_1$, is greater
than the appropriate tabulated value in Table I for the single sampling plan
$(n, c)$, then the operating characteristic curve will pass below the point $(p_1, .95)$.
That is, the risk of rejection under the sampling plan for lots of fraction defective
$p_1$ will be somewhat more than 5%. On the other hand, if a selected acceptable
quality level $p_1$ is less than the appropriate entry in Table I, the risk of rejection
for a product of fraction defective $p_1$ will be less than 5%. Similar considerations
apply also to the fractions defective, $p_2$, in Table II.


4. Single sampling plans based on the Poisson approximation to the binomial.
Tables I and II are useful for determination of a single sampling plan when the
desired percent defectives are listed and $n$ does not exceed 150. Table III
is particularly useful in designing a single sampling plan when we are interested
in fractions defective not greater than about .10. A somewhat similar procedure
has already been suggested by Peach and Littauer [3]. If we designate by
$P(c, a)$ the sum of individual Poisson probabilities,

$$P(c, a) = \sum_{m=0}^{c} \frac{e^{-a} a^{m}}{m!},$$

then Table III gives values $a_1 = np_1$ of $a$ for which

$$P(c, a) = .95$$

and values $a_2 = np_2$ of $a$ for which

$$P(c, a) = .10.$$

Hence, to find the single sampling plan whose operating characteristic curve
passes nearly through the points $(p_1, .95)$ and $(p_2, .10)$ one merely divides
values of $a_1$ in Table III for various values of $c$ by the acceptable quality level $p_1$
and divides values of $a_2$ in Table III by the objectionable percent defective $p_2$.
Then the acceptance number $c$ is picked for which $a_1/p_1$ most nearly equals $a_2/p_2$
and the approximate sample size $n$ may be determined by rounding to an integer
the average of the two approximately equal numbers $a_1/p_1$ and $a_2/p_2$.
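The Table III procedure can be sketched in a few lines by computing the Poisson points $a_1$ and $a_2$ numerically instead of reading them from the table. The helper names and the search limit `max_c` are assumptions made for this illustration.

```python
from math import exp, factorial

def poisson_cdf(c, a):
    """P(c, a): sum of the Poisson probabilities for m = 0, ..., c with mean a."""
    return sum(exp(-a) * a ** m / factorial(m) for m in range(c + 1))

def poisson_point(c, target, hi=100.0, tol=1e-10):
    """Solve P(c, a) = target for a by bisection; P(c, a) decreases in a."""
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if poisson_cdf(c, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def poisson_plan(p1, p2, max_c=10):
    """Pick c where a1/p1 and a2/p2 are closest; n is their rounded average."""
    best = None
    for c in range(max_c + 1):
        n1 = poisson_point(c, 0.95) / p1
        n2 = poisson_point(c, 0.10) / p2
        if best is None or abs(n1 - n2) < best[0]:
            best = (abs(n1 - n2), c, round((n1 + n2) / 2))
    return best[1], best[2]

print(poisson_plan(0.01, 0.10))   # (c, n) for p1 = .01 and p2 = .10
```

For $p_1 = .01$ and $p_2 = .10$ this reproduces the plan $c = 1$, $n = 37$ worked out by hand in the example of the next section.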


5. Example on the use of Tables I, II, III. Given an acceptable percent
defective or quality level of .01 and an objectionable quality level of .10, it is
desired to find the single sampling plan which will accept 95% of product which is
of quality $p_1 = .01$ and which will reject 90% (or accept only 10%) of product of
quality $p_2 = .10$. Looking in Table I for entries $p_1$ which approximately equal
.01 and in Table II for entries $p_2$ which approximately equal .10 such that the
$c$ and $n$ of Tables I and II correspond, we see that $c$ must be equal to 1 whereas $n$
may take possibly any one of the values 35, 36, 37, 38. In this connection, we
have to set up some criterion for the choice of $n$. Although any of several criteria
may be used, a reasonable criterion appears to involve picking $n$ such that the
sum of the absolute departures of the Operating Characteristic Curve from the
risks $\alpha = .05$ at $p_1$ and $\beta = .10$ at $p_2$ is a minimum. This may be determined by
using appropriate tables of Binomial Probabilities or by computing at $p_1$ and $p_2$
the chance of obtaining $c$ or less defectives in $n$ for the various possible combinations
of $c$ and $n$. If the above criterion were applied to the present example,
the combination $c = 1$ and $n = 37$ would be selected, i.e. the single sampling
plan would be $c = 1$, $n = 37$. For this sampling plan, the probability of passing
at $p_1 = .01$ is .9471 and the probability of passing at $p_2 = .10$ is .1036. For
the sake of expediency, another proposal would be merely to select somewhat of a
“middle” value of $n$ especially when the variation in sample size is slight.
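The selection criterion of this example, the smallest sum of absolute departures from the two risks, can be evaluated directly from binomial probabilities. The sketch below is illustrative; `accept_prob` and `best_n` are hypothetical names, and the candidate sample sizes are the four values named in the example.

```python
from math import comb

def accept_prob(c, n, p):
    """Probability of c or fewer defectives in a binomial sample of size n."""
    return sum(comb(n, m) * p ** m * (1 - p) ** (n - m) for m in range(c + 1))

def best_n(c, candidates, p1, p2, alpha=0.05, beta=0.10):
    """Choose n minimizing |P(c,n,p1) - (1 - alpha)| + |P(c,n,p2) - beta|."""
    def departure(n):
        return (abs(accept_prob(c, n, p1) - (1 - alpha))
                + abs(accept_prob(c, n, p2) - beta))
    return min(candidates, key=departure)

n = best_n(1, range(35, 39), 0.01, 0.10)
print(n, round(accept_prob(1, n, 0.01), 4), round(accept_prob(1, n, 0.10), 4))
```

The two printed probabilities are the chances of passing at $p_1$ and at $p_2$ under the selected plan.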

If we use Table III for the above example, we can select $n$ and $c$ with the aid
of the following simple tabulation:

                          c
    n              0       1       2       3
    a_1/p_1       5.1    35.5    81.8   136.6
    a_2/p_2      23.0    38.9    53.2    66.8

Since the sample sizes “cross” at $c = 1$, we would select $c = 1$ and
$n = \tfrac{1}{2}(35.5 + 38.9) = 37.2$ or $n = 37$.

A use of Table I of some practical importance is in determining at a glance 
those values of p for which the probability of obtaining c or less defectives in a 
sample of n is equal to .95. As a matter of fact, a series of tables similar to 
Table I and Table II for which P(c, n, p) = .99, .95, .90, .10, .05, .01 etc. would
be of considerable practical use. 


Acknowledgment. The author is indebted to Miss Helen J. Coon for carrying 
out the computations for the tables. 




TABLE I
Values of p = p_1 such that P(c, n, p_1) = .95

                                         c
  n  0       1       2       3       4       5       6       7       8       9
  1  .0500
  2  .0253   .224
  3  .0170   .135    .368
  4  .0127   .0976   .249    .473
  5  .0102   .0764   .189    .343    .549
  6  .00851  .0628   .153    .271    .418    .607
  7  .00730  .0534   .129    .225    .341    .479    .652
  8  .00639  .0464   .111    .193    .289    .400    .529    .688
  9  .00568  .0410   .0978   .169    .251    .345    .450    .571    .717
 10  .00512  .0368   .0873   .150    .222    .304    .393    .493    .606    .741
 11  .00465  .0333   .0788   .135    .200    .271    .350    .436    .530    .636
 12  .00427  .0305   .0719   .123    .181    .245    .315    .391    .473    .562
 13  .00394  .0281   .0660   .113    .166    .224    .287    .355    .427    .505
 14  .00366  .0260   .0611   .104    .153    .206    .264    .325    .390    .460
 15  .00341  .0242   .0568   .0967   .142    .191    .244    .300    .360    .423
 16  .00320  .0227   .0531   .0903   .132    .178    .227    .279    .333    .391
 17  .00301  .0213   .0499   .0846   .124    .166    .212    .260    .311    .364
 18  .00285  .0201   .0470   .0797   .116    .156    .199    .244    .291    .341
 19  .00270  .0190   .0445   .0753   .110    .147    .188    .230    .274    .320
 20  .00256  .0181   .0422   .0714   .104    .140    .177    .217    .259    .302
 21  .00244  .0172   .0401   .0678   .0988   .132    .168    .206    .245    .286
 22  .00233  .0164   .0382   .0646   .0941   .126    .160    .196    .233    .271
 23  .00223  .0157   .0365   .0617   .0898   .120    .152    .186    .222    .258
 24  .00213  .0150   .0350   .0590   .0859   .115    .146    .178    .212    .246
 25  .00205  .0144   .0335   .0566   .0823   .110    .139    .170    .202    .236
 26  .00197  .0138   .0322   .0543   .0790   .106    .134    .163    .194    .226
 27  .00190  .0133   .0310   .0522   .0759   .101    .129    .157    .186    .217
 28  .00183  .0128   .0298   .0503   .0731   .0977   .124    .151    .179    .208
 29  .00177  .0124   .0288   .0485   .0705   .0942   .119    .145    .172    .200
 30  .00171  .0120   .0278   .0469   .0681   .0909   .115    .140    .167    .193
 31  .00165  .0116   .0269   .0453   .0658   .0878   .111    .135    .161    .187
 32  .00160  .0112   .0260   .0438   .0637   .0850   .107    .131    .155    .180
 33  .00155  .0109   .0252   .0425   .0617   .0823   .104    .127    .150    .175
 34  .00151  .0106   .0245   .0412   .0598   .0798   .101    .123    .146    .169
 35  .00146  .0102   .0238   .0400   .0580   .0774   .0978   .119    .141    .164








TABLE I (Continued)

                                         c
  n  0       1       2       3       4       5       6       7       8       9
 36  .00142  .00996  .0231   .0389   .0564   .0752   .0950   .116    .137    .159
 37  .00139  .00969  .0225   .0378   .0548   .0731   .0923   .112    .133    .155
 38  .00135  .00943  .0219   .0368   .0533   .0711   .0898   .109    .130    .150
 39  .00131  .00919  .0213   .0358   .0519   .0692   .0874   .106    .126    .146
 40  .00128  .00896  .0208   .0349   .0506   .0674   .0851   .104    .123    .142
 41  .00125  .00874  .0202   .0340   .0493   .0657   .0830   .101    .120    .139
 42  .00122  .00853  .0198   .0332   .0481   .0641   .0809   .0985   .117    .135
 43  .00119  .00833  .0193   .0324   .0470   .0625   .0790   .0961   .114    .132
 44  .00117  .00814  .0188   .0317   .0459   .0611   .0771   .0938   .111    .129
 45  .00114  .00795  .0184   .0309   .0448   .0597   .0754   .0917   .109    .126
 46  .00111  .00778  .0180   .0302   .0438   .0584   .0737   .0896   .106    .123
 47  .00109  .00761  .0176   .0296   .0429   .0571   .0720   .0876   .104    .120
 48  .00107  .00745  .0172   .0290   .0420   .0559   .0705   .0857   .101    .118
 49  .00105  .00730  .0169   .0284   .0411   .0547   .0690   .0839   .0993   .115
 50  .00103  .00715  .0166   .0278   .0402   .0536   .0676   .0822   .0972   .113
 51  .00101  .00701  .0162   .0272   .0394   .0525   .0662   .0805   .0953   .110
 52  .000986 .00688  .0159   .0267   .0387   .0515   .0649   .0789   .0934   .108
 53  .000967 .00675  .0156   .0262   .0379   .0505   .0637   .0774   .0916   .106
 54  .000949 .00662  .0153   .0257   .0372   .0495   .0625   .0759   .0898   .104
 55  .000932 .00650  .0150   .0252   .0365   .0486   .0613   .0745   .0881   .102
 56  .000916 .00638  .0148   .0248   .0358   .0477   .0602   .0731   .0865   .100
 57  .000899 .00627  .0145   .0243   .0352   .0468   .0591   .0718   .0849   .0984
 58  .000884 .00616  .0142   .0239   .0346   .0460   .0580   .0705   .0834   .0966
 59  .000869 .00606  .0140   .0235   .0340   .0452   .0570   .0693   .0820   .0949
 60  .000855 .00595  .0138   .0231   .0334   .0445   .0561   .0681   .0806   .0933
 61  .000841 .00586  .0135   .0227   .0329   .0437   .0551   .0670   .0792   .0917
 62  .000827 .00576  .0133   .0223   .0323   .0430   .0542   .0659   .0779   .0902
 63  .000814 .00567  .0131   .0220   .0318   .0423   .0533   .0648   .0766   .0887
 64  .000801 .00558  .0129   .0216   .0313   .0416   .0525   .0637   .0754   .0873
 65  .000789 .00549  .0127   .0213   .0308   .0410   .0516   .0627   .0742   .0859
 66  .000777 .00541  .0125   .0210   .0303   .0403   .0508   .0618   .0730   .0846
 67  .000765 .00533  .0123   .0206   .0299   .0397   .0501   .0608   .0719   .0833
 68  .000754 .00525  .0121   .0203   .0294   .0391   .0493   .0599   .0708   .0820
 69  .000743 .00517  .0120   .0200   .0290   .0385   .0486   .0590   .0698   .0808
 70  .000733 .00510  .0118   .0198   .0286   .0380   .0479   .0582   .0687   .0796





TABLE I (Continued)

                                         c
  n  0       1       2       3       4       5       6       7       8       9
 71  .000722 .00503  .0116   .0195   .0282   .0374   .0472   .0573   .0678   .0785
 72  .000712 .00496  .0115   .0192   .0278   .0369   .0465   .0565   .0668   .0773
 73  .000702 .00489  .0113   .0189   .0274   .0364   .0459   .0557   .0658   .0762
 74  .000693 .00482  .0111   .0187   .0270   .0359   .0452   .0549   .0649   .0752
 75  ...     ...     ...     .0184   .0266   .0354   .0446   .0542   .0641   .0742
 76  .000675 .00470  .0108   .0182   .0263   .0349   .0440   .0535   .0632   .0732
 77  .000666 .00463  .0107   .0179   .0259   .0345   .0434   .0528   .0623   .0722
 78  .000657 .00457  .0106   .0177   .0256   .0340   .0429   .0521   .0615   .0712
 79  .000649 .00452  .0104   .0175   .0253   .0336   .0423   .0514   .0607   .0703
 80  .000641 .00446  .0103   .017?   .0249   .0332   .0418   .0507   .0600   .0694
 81  .000633 .00440  .0102   .0170   .0246   .0328   .0413   .0501   .0592   .0685
 82  .000625 .00435  .0100   .0168   .0243   .0323   .0408   .0495   .0585   .0677
 83  .000618 .00430  .00992  .0166   .0240   .0319   .0403   .0489   .0577   .0668
 84  .000610 .00425  .00980  .0164   .0237   .0316   .0398   .0483   .0570   .0660
 85  .000603 .00420  .00969  .0162   .0235   .0312   .0393   .0477   .0564   .0652
 86  .000596 .00415  .00957  .0160   .0232   .0308   .0388   .0471   .0557   .0645
 87  .000589 .00410  .00946  .0159   .0229   .0305   .0384   .0466   .0550   .0637
 88  .000583 .00405  .00936  .0157   .0227   .0301   .0379   .0460   .0544   .0630
 89  .000576 .00401  .00925  .0155   .0224   .0298   .0375   .0455   .0538   .0622
 90  .000570 .00396  .00915  .0153   .0221   .0294   .0371   .0450   .0532   .0615
 91  .000564 .00392  .00904  .0152   .0219   .0291   .0367   .0445   .0526   .0608
 92  .000557 .00388  .00895  .0150   .0217   .0288   .0363   .0440   .0520   .0602
 93  .000551 .00383  .00885  .0148   .0214   .0285   .0359   .0435   .0514   .0595
 94  .000546 .00379  .00875  .0147   .0212   .0282   .0355   .0431   .0509   .0589
 95  .000540 .00375  .00866  .0145   .0210   .0279   .0351   .0426   .0503   .0582
 96  .000534 .00371  .00857  .0144   .0207   .0276   .0347   .0421   .0498   .0576
 97  .000529 .00368  .00848  .0142   .0205   .0273   .0344   .0417   .0493   .0570
 98  .000523 .00364  .00840  .0141   .0203   .0270   .0340   .0413   .0487   .0564
 99  .000518 .00360  .00831  .0139   .0201   .0267   .0337   .0408   .0482   .0558
100  .000513 .00357  .00823  .0138   .0199   .0265   .0333   .0404   .0478   .0553





101 .000508 .00353 .00814 .0136 | .0197 .0262,.0330 .0400 .0473).0547 101 
102 |.000503'.00350' .00806,.0135 | .0195,.0259 .0327' .0396) .0468).0542, 102 
103 |.000498 .00346 .00799 .0134 .0193 .0257,.0323: .0392).0463).0536 103 
104 |.000493 .00343 .00791 .0132 .0191 .0254!.0320 .0389' .0459|.0531, 104 
105 |.000488 .00339 .00783 .0131 .0189 .0252 .0317 .0385! .0454).0526' 105 













106 
107 
108 
109 
110 


111 
112 
113 
11+ 
115 


116 
117 
118 
119 
120 


126 
127 
128 
129 
130 


131 
132 
133 
134 
135 


136 
137 
138 
139 
140 


0 


.000484 
.000479 
.000475 
.000470 
000466 


.000462 
.000458 
.000454 
.000450 
.000446 


| 


.000442) 
.000438 
.000435 
.000431 
.000427 


.000424 
.000420 
.000417 
.000414 
. 000410 


.000407 
.000404 
.000401 
.000398 
.000394 


.000391 
.000389 
.000386 
.000383) 


.000369 
.000366 .00254) . 


.00336 
.00333 
.00330 
.00327 
.00324 


.00321| 
.00318 
.00315 
.00313 
.00310 


.00307' 
00305) 
00302 
.00299 
00297 


.00294 
.00292 
-00290 
-00287 
.00285 


.00283 
.00281 
.00278 
.00276 
.00274 


.00272 
00270 
.00268 
00266 
00264! 


| 
00262 
.00260 
00258 
00256 


00741 
.00734| 
00727 
.00721 
.0071 5| 


.00709 
.00702 
.00696 
.00691| 
.00685 


00679 
00674 
.00668 
00663 
00657 


.00652 
.00647 
.00642 
.00637 
.00632 


.00627 
.00622 
.00618 
00613) 
00608) 


.00604; 
.00599} .0100 
00595 


TABLE I—Continued 


.00776, .0130 |.0188 
.00768 .0129 |.0186 
.00761).0127 |.0184 
00754 .0126 |.0182 
00747 .0125 |.0181 


0249) 
.0247| 
0245 
0242 
.0240 


.O314 . 
.O311. 
.0308 
.0305 
.0302 


.0300! 
.0297) 
; 0294, 
0292 
.0289| 


.0179) 
.0178 
|-0176) 
.O174 
.0173 


.0124 
.0123 
.0122 
.O121 
.0120 


.0238 
.0236 
.0234 
.0232) 
.0230 


.0228 
.0226) 
.0224; 
.0222| 
.0220) 


.0119 
.0118 
.O117 
.0116 
0115 


.0171 
.0170 
|.0168 
.0167 
.0166 


.0287) 
0284) 
.0282) 
.0279 

.0277| 
| 
.O114 
.0113 
.0112 
.O111 
.0110 





0275 
.0272 
.0270 
.0268 
.0266 


0218 
.0216 
.0215 
0213) 
0211 


| 0164 
0163) 
0162 
0160) 
.0159 








.0109 
.O108 
.O107 
.0107 
.OLO6 


.O1d58 
0156 
0155 
0154! 
0153) 


.0209 
.0208 
.0206 
.0204 
.0203 


.0264 
.0262 
.0259 
.0257 
.0255 


0152) 
0150) 


.0149 


.O105 
.0104 
.0103 


.0201 
0200 
| .0198) 

-0103 | .0148| .0197).0248 
.0102 —— 
| 

| o101 .0146}.0194) .0244'. 
|-0145] .0192 .0242 
.00996 .0144] .0191) .0240 . 
.00591|.00989' .0143'.0190' .0239 . 
00587) .00982' .0142'.0188 .0237 


.0253 
.0252 
.0250 





250 


.0364 .0430) 
.0360 .0426) 
0357 
.0354) .0418) 
-0351).0414 


0348) .0411) 
.03.45! .0407! 
0342) .0404 
.0339' .0400 
.0336 

| 


.0333) .0394: 
.0330! .0390 
.0328 .0387 
0325 .0384 
.0322; .0381| 


0381 .0450 
0378 .0446 


.0374 .0442 
.0370 .0438' 
.0367 


.0433 


.0422) 


.0397 





.0320 .0378 
.0317 
.O315 
.0312 .0369 
.0310 .0366 


.0375 
.0372 


.0308) .0363 
.0305 
.0303' .0358 
0301! .0355) 
.0298 .0352) 


.0360 


0296 .0350 
0294 .0347 
0292 .0344 
0290 .0342 


.0288 .0339 


.0497 
.0492 
0488 
0484 
0479) 


0475 
.0471 
0467 
.0463 
0459) 


.0521 

.0516) 
.0511| 
.0506| 
.0502 


| 
| 


| 





| 


.0455' 
0451! 
0448) 
.O444 

.0440' 


.0437 
.0433 
.0430 
.0427 
.0423 


0420 
0417 
0414 
0410! 
0407 


0404) 
0401) 
0398 
0395 
0393 


106 
107 
108 
109 
110 


111 
112 
113 
114 
115 


116 
117 
118 
119 
120 


121 
122 
123 
124 
125 


126 
127 
128 
129 
130 





tt toh 


141 


142 
143 
144 
145 


146 
147 
148 
149 
150 


|-000364 
.000361 
.000359 
|-000356 
000354 


|.000351 
|.000349 
|.000346 

000344 


00253. 
.00251 . 
.00249). 
.00247 . 
.00246) . 





.000342 


1 


| 
\- 


00244). 
.00242). 
00241). 
00239). 
.00237| . 


DESIGNING SAMPLING PLANS 


TABLE I—Concluded 


y 4 
« 


00582! .00975|.0141) .0187 
.00968 .0140'. 
.00961 | .0139). 
00954, .0138) . 
00948) .0137). 


00578 
00574 
00570 
00566 


00562 


00559 


00555 


00551 
00547 





c 


3 | 4 


5 


00941) .0136). 
.00935) .0135). 
.00928) .0134'. 
.00922! .0133}. 
.00916| .0132 


0186 
0184 
0183 
0182 


0180) 
0179) 
0178 
0177 
0176 


6 
-0235 . 
.0234'. 
.0232 . 


-0230 . 
.0229 . 


.0227). 
.0226 . 
0224 . 
.0223 
.0221 . 


0285 
0283 
0281 
0279 


0278 


0276 
0274 
0272 
0270 
0268 


8 | 9 


.0337 | .0390 
-0335) .0387 
.0332 .0384 
.0330, .0382 
.0328 .0379 


.0325 .0376 
.0323' .0374 
0321) .0371 
.0319, .0369 
.0317 .0366 


251 


n 


141 
142 
143 
144 
145 


146 
147 
148 
149 
150 





TABLE II 
Values of $p = p_2$ such that $P(c, n, p_2) = .10$


1 | | 4 
2} .684 | .949 | | 2 
3! .536 | .804 | .965 | | | 3 
4| .438 | .680 | .857 | .974 | | | 4 
5| .369 | .584 | .753 | | 5 


6 .319 | .510 | .667 
7| .280 | .453 | .596 
8 .250 | .406 | .538 
Q .226 | .368 | .490 


| 
| .983 
10.206 | .337 | .450 | .552 | .646 


921 | .985 a, 
.853 | .931 | .987 | 
.790 | .871 | .939 .988 9 
.733 | .812 | .884  .945 | .990] 10 


CO 


11; .189 | .310 
12 .175 | .288 
13, .162 | .268 
14, .152 | .251 
15) .142 | .236 


.682 | .759 | .831 | .895 | .951) 11 
.638 | .712 | .781 | .846 | .904} 12 
598 | .669 | .736 | .799 | .858) 13 
.563 | .631 | .695 | .757  .815} 14 
5382 | .596 | .658 | .718 774] 15 


oe 
“I 
qu 
ou 
qu 
© 


we 
f— 
— 
cu 
bo 
CO 








16 .134 | .222 
17; .127 | .210 | .284 


Ww 
we 
oo 


371 | .489 | 504. .565 | .625  .682 | .737] 16 
| 852 | .416 | .478 .537 | .594 | .650 | .703] 17 
IS .120 | .199 | | | .455 | .512 | .567 | .620 | .671| 18 
19 .114 | .190 | .257 | .319 | .378 | .484 | .489 .541 .592 642] 19 
20.109 | .181 | .245 | .304 ; .361 | .415 | .467 | .518 | .567 | .615] 20 


~ 
cae 
S 
A 
2 
ve 
Oo 
pen 
ih 
a 
© 
oe 


104 | .173 


21 234 | .291 | .345 | .397 | .448 | .497 | .544 | .590) 21 
22 .0994| .166 | .224 | .279 .331 | .381 | .480 | .477 | .523 | .568] 22 
93.0953! .159 | .215 | .268 | .318 | .366 | .413 | .459 | .503 | .546] 23 
94 .0915| .153 | .207 | .258 | .306 | .352 | .398 | .442 | .485 | .526) 24 
25.0880! .147., .199 | .248  .295 | .340 , .383 | .426 | .467 | .508| 25 





| 328 





26.0847! .142 | .192 | .239  .284 370 | .411 | 451, .491] 26 
97° .0817! .137 | .185 | .281 | .275 | .317 | .358 | .397 | .436 | .475| 27 
298 .0789' .132  .179 | .223 .265 | .306 | .346 | .3885  .422 | .459) 28 
299 .0763' .128 | .173 | .21G | .257 | .297 | .3835 | .872 | .409 , .445) 29 
30.0739! .124  .168 | .209 .249 | .288 | .325 


.061 | .397 -432| 30 


31.0716 120.163.2038. -.241.-.279 315 | .350 885 , .419] 31 
32 0694 116.138.197.284. .271 | .306 | .340. 374 | .407! 32 
33.0674 .113. .153 .191 | .228 | 263.297. .331 | .364 | .396| 33 
34. .0655 .110 | .149 .186 .221 , .256 , .289 | 322) -.354 | .385) 34 
3 


35 .0637: .107 | .145 | .181 | .216 . .249 | .282 | .313 | .345 | .375) 35 





TABLE II—Continued

   |                                     c                                     |
 n |   0   |   1   |   2   |   3   |  4   |  5   |  6   |  7   |  8   |  9   | n
36 | .0620 | .104  | .141  | .176  | .210 | .242 | .274 | .305 | .336 | .366 | 36
37 | .0603 | .101  | .138  | .172  | .205 | .236 | .267 | .298 | .327 | .357 | 37
38 | .0588 | .0985 | .134  | .167  | .199 | .230 | .261 | .290 | .319 | .348 | 38
39 | .0573 | .0961 | .131  | .163  | .195 | .225 | .254 | .283 | .312 | .340 | 39
40 | .0559 | .0938 | .128  | .159  | .190 | .220 | .248 | .277 | .305 | .332 | 40

41 | .0546 | .0916 | .125  | .156  | .186 | .215 | .242 | .270 | .298 | .324 | 41
42 | .0533 | .0895 | .122  | .152  | .181 | .210 | .237 | .264 | .291 | .317 | 42
43 | .0521 | .0875 | .119  | .149  | .177 | .205 | .232 | .259 | .285 | .310 | 43
44 |       |       | .116  | .146  | .174 | .201 | .227 | .253 | .279 | .304 | 44
45 | .0499 | .0837 | .114  | .142  | .170 | .196 | .222 | .248 | .273 | .297 | 45

46 | .0488 | .0819 | .112  | .140  | .166 | .192 | .218 | .243 | .268 | .291 | 46
47 | .0478 | .0803 | .109  | .137  | .163 | .188 | .213 | .238 | .262 | .285 | 47
48 | .0468 | .0786 | .107  | .134  | .160 | .185 | .209 | .233 | .257 | .280 | 48
49 | .0459 | .0771 | .105  | .131  | .157 | .181 | .205 | .229 | .252 | .274 | 49
50 | .0450 | .0756 | .103  | .130  | .154 | .178 | .201 | .224 | .248 | .269 | 50

51 | .0441 | .0741 | .101  | .126  | .151 | .174 | .197 | .220 | .243 | .264 | 51
52 | .0433 | .0728 | .0991 | .124  | .148 | .171 | .194 | .216 | .239 | .259 | 52
53 | .0425 | .0714 | .0973 | .122  | .145 | .168 | .190 | .212 | .235 | .255 | 53
54 | .0417 | .0701 | .0956 | .120  | .143 | .165 | .187 | .208 | .230 | .250 | 54
55 | .0410 | .0689 | .0939 | .117  | .140 | .162 | .184 | .205 | .227 | .246 | 55

56 | .0403 | .0677 | .0923 | .115  | .138 | .159 | .180 | .201 | .223 | .242 | 56
57 | .0396 | .0665 | .0907 | .113  | .135 | .157 | .177 | .198 | .219 | .238 | 57
58 | .0389 | .0654 | .0892 | .112  | .133 | .154 | .175 | .195 | .216 | .234 | 58
59 | .0383 | .0643 | .0877 | .110  | .131 | .152 | .172 | .191 | .212 | .230 | 59
60 | .0376 | .0633 | .0863 | .108  | .129 | .149 | .169 | .188 | .209 | .226 | 60

61 | .0370 | .0623 | .0849 | .106  | .127 | .147 | .166 | .185 | .206 | .223 | 61
62 | .0365 | .0613 | .0836 | .105  | .125 | .145 | .164 | .183 | .203 | .219 | 62
63 | .0359 | .0603 | .0823 | .103  | .123 | .142 | .161 | .180 | .200 | .216 | 63
64 | .0353 | .0594 | .0810 | .101  | .121 | .140 | .159 | .177 | .197 | .213 | 64
65 | .0348 | .0585 | .0798 | .0999 | .119 | .138 | .156 | .174 | .194 | .210 | 65

66 | .0343 | .0577 | .0786 | .0984 | .117 | .136 | .154 | .172 | .191 | .207 | 66
67 | .0338 | .0568 | .0775 | .0970 | .116 | .134 | .152 | .169 | .188 | .204 | 67
68 | .0333 | .0560 | .0764 | .0956 | .114 | .132 | .150 | .167 | .185 | .201 | 68
69 | .0328 | .0552 | .0753 | .0943 | .113 | .130 | .148 | .165 | .182 | .198 | 69
70 | .0324 | .0544 | .0743 | .0930 | .111 | .128 | .146 | .162 | .179 | .195 | 70










“Iss 


“J 
Cok Whe 


1s] 
or 


2 + 
QO = 


~] 
on 
—_ 


SU 


81 
82 
83 
84 
85 


86 
87 
88 
89 
90 


91 
92 
93 
94 
95 


96 
97 
98 
99 
100 


101 

102 
103 
LO4 
105 


( 
J | 


0319] 
0315 
.0310) 
. 0306} 
9302) 


.9298 
0295 
.029] 
.G287 
0284 


0280, 
.0277 
0274 
0270 


.0267 


.0264 
.0261 
.0258 
.0255 
0253) 


.0250! 
0247 
0245 
.0242 


.0239 


.6237 
.0235 
.9232 
.0230 
.0228 


.0225 
.0223| 
.0221/) 
.0219 
.0217 


1 


.0537 
.0530 
.0522 
0516 
.0509 


.0502 
0496 
.0490 
0483 
.O478 


.0472 
.0466 
.0461 

0455 
.O4 50) 


0445) 
.0440 
.0435 
.0430 
.0425 


0421 
0416 
0412 
0408 
0403 


.0399 
.0395 
.0391 
.0387 
.0383 


-0380 
.0376 
.0372 
.0369 
.0365 


9 


0732 


0722 
.0713| 
.0703 


-O694! 


0685 
.0676 
.0668 
0660 
0652 


0644 
.0636 
.0629 
.0621) 
.0614! 


.0607 
.0600 
0594 
-0587) 
.0581| 


0574! 
.0568 
0562 
0556 
0551) 


0545 
.0539 
.0534 
.0529 
0524 


.0518 
.0513 
.0508 
.0504 


.0499 


TABLE II—Continued 


.0625 








c 
3 | 4 

.0917, .109 
0904 .108 
.0892) .107 
0881} .105 | 
.0869| .104 
0858, .102 
.0847) .101 
.0836) .0999 
.0826, .0987| 
.0816, .0974 
0806) .0963 
.0797| .0951) 
.0787) .0940 
.0778| .0929 
.0769| .0918 
.0760| .0908 
.0752, .0898 
.0743| .0888 
0735 .0878 
.0727| .0869 
0719 .0859 
0712) .0850 
0704 .0841) 
.0697, .0832 
.0690, .0824) 
0683. .0815 
0676, .0807 
0669 .0799 
0662 .0791 
0656 .0784) 
0650, .0776 
.0643) .0768) 
.0637, .0761| 
.0631| .0754 


.0747) 


254 


.0865 


5 6 
127 | .144 
.125 | .142 
.123 | .140 
.122 | .138 
.120 | .136 
119 | .134 
117 | .133 
116 | .131 
.114 | .130 
.113 | .128 
111 | .126 
110 | .125 
.109 | .123 
.108 | .122 
.106 | .121 
.105 | .119 
104 | .118 
.103 | .117 
.102 | .115 
.101 | .114 
.0995| .113 
.0985) .112 
.0974) .110 
0964, .109 
.0954, .108 
.0945| .107 
.0935) .106 
0926, .105 
.0917| .104 
0908) .103 
.0899 .102 
.0890) .101 
0882, .100 
.0874| .0991 


. 133 
.132 
. 130 
.129 


127 


.126 
. 125 
.123 
. 122 
.121 


. 120 
.118 
-117 
.116 
-115 


.114 
.113 
.112 
-111 


.0981 .110 


. 147 


. 143 
. 142 
. 140 


.139 
137 

136 
135 
.133 


132 
131 
.129 | 
. 128 
127 


ad 


nr 
9 
193) 71 
190) 72 
.188} 73 
185) 74 
183] 75 


152) 91 
.150) 92 
.148) 93 
.147| 94 
145) 95 


.144) 96 
143) 97 
.141| 98 
.140| 99 
138/100 





. 137/101 
.136|102 
.134|103 
.133\104 
.132,105 





116 
117 
118 
119 
120 


121 
122 
123 
124 
125 


126 
127 
128 
129 
130 


131 
132 
133 
134 
139) 


136) 
137) 
138) 
139 
140 





.0200 
5| .0198 


.0197 
0195 
.0193 
.0192 
.0190 


.0189 
.0187 
.0185 
.0184 
.0183 


.0181 
.0180 
.0178 
.0177 
.0176 


.0174 
.0173 
.0172 
.0170 
.0169 


.0168 
.0167 
.0165 
.0164 


.0163 





.0294 
0291) 
0289) 
.0287 
0285) 








.0318 
.0315 
.0313 
.0310 
.0308 


.0305 
.0303 
.0301 
.0298 
.0296 





.0283 
.0281 
.0279 
0277 
.0275 


| 0472 
3, .0468 
0464 
.0460 
.0456 


-0401 
.0398 
.0395 
.0392 
.0389) 


9 


~~ 


_ .0494 
| .0490 
| 0485 
0481 
.0477 


-0619 





.0452 
.0449 
.0445 
.0441 
.0437 


.0434 
.0430 
.0427 
.0424 
.0420 


.0417 
.0414 
.0410 
.0407 
.0404 





-0387| 
.0384 
.0381| 
.0378 


.0376 


3 


0614 
.0608 
.0603 
0597 
0592 
0587 
0582 
.0577 
0572 


.0567 
.0562 
.0557 





.0553 
.0548 


0544 
.0539 
.0535 
.0531 
.0527 


.0523 
.0519 
.0515 
.0511 
.0507 


.0503 
.0499 
-0495 
.0492 
.0488 


0485 
.0481| 
.0478 
0474) 
.0471) 


c 


4 | 


.0740 
.0733 
.0727 
.0720 
.O714 


0707 
.0701| 
0695 
.0689 
.0683| 


-0677 
.0672 
-0666 
.0661 
-0655 


-0650 
.0645 
.0639 
.0634 
.0629 


.0624 
.0620 
-0615 
-0610 
-0606 


.0601 
.0596 
.0592 
.0588 
.0583 


0579 
0575 
0571 
0567 


.0563) 


0857 
.0850! 
0842! 
0834 
.0827 


-0820) 
0812 
.0805 
.0798 
.0792 


.0785 
.0778 
.0772 
-0765 
.0759 


-0696 
.0691 
-0686 
.0681 
.0676 


.0671 
.0666) 
.0662 
0657) 
.0652 


TABLE II—Continued 





5 


.0972 
0964 
0955 
0946 
.0938 


.0930 
.0921| 
0913) 
.0906 
.0898 





.0753 
.0747 
.0741 
.0735 
.0729 


.0724 
.0718 
.0713 
.0707 
.0702 





0790) 
0784 
.0778| 
0773) 
0767 


6 | 





.0890 
.0883 
.0875 
.0868 
.0861 


.0854 
.0847 
.0841 
.0834 
.0827 


.0821 
.0815 
.0808 
.0802 
.0796 





.0762 
.0756 
0751 
0745 
0740. 


109 | .120 
.108 | .119 
.107 | .118 
106 | 
.105 | .115 


.0994/ .110 

.0986| .109 
.0977| .108 
.0969) .107 
-0961| .106 


.0954| .105 
.0946| .104 
.0938| .104 
.0931| .103 |. 
0924} .102 


.0917| .101 
.0909) .100 
.0902) .0995 
-0896| .0988 
.0889} .0980 . 


.0882) .0973 
0876! .0966 
0869) .0959 
.0863| .0952 
.0857| .0945 . 


.0850 .0938 
.0844) .0931 
.0838, .0925 
.0832 .0918 


~I 
0 | 


i 
— 
o> 


104 | .114 
.103 | .113 
.102 | .112 
101 | .111 
.100 | .111 





| 


0826 .0912 








-~ 


255 





.120 
119 
.118 
1.117 
|.116 

115 


.114 
113 


112 


-1ll 


.110 
.110 
. 109 
.108 


107 


. 106 
.105 
.105 
. 104 


103 


.102 
.102 
.101 
.100 
£9996 140 





.131 |106 
130 
128 
127 


. 126 


107 
108 
1109 
110 


111 


4 {112 

3/113 

2 114 

1/115 
| 


1116 
117 
1118 
1119 
1120 


121 
1122 
1123 
1124 
125 


1126 
1127 
128 
(129 
(130 


131 
132 
1133 
134 
135 


136 
137 
138 
139 


FRANK E. GRUBBS


TABLE II—Concluded

    |                                       c                                       |
 n  |   0   |   1   |   2   |   3   |   4   |   5   |   6   |   7   |   8   |   9   |  n
141 | .0162 | .0273 | .0373 | .0468 | .0559 | .0648 | .0735 | .0821 | .0905 | .0989 | 141
142 | .0161 | .0271 | .0370 | .0464 | .0555 | .0643 | .0730 | .0815 | .0899 | .0982 | 142
143 | .0160 | .0269 | .0368 | .0461 | .0551 | .0639 | .0725 | .0809 | .0893 | .0975 | 143
144 | .0159 | .0267 | .0365 | .0458 | .0547 | .0635 | .0720 | .0804 | .0887 | .0969 | 144
145 | .0158 | .0266 | .0363 | .0455 | .0544 | .0630 | .0715 | .0798 | .0881 | .0962 | 145

146 | .0156 | .0264 | .0360 | .0452 | .0540 | .0626 | .0710 | .0793 | .0875 | .0956 | 146
147 | .0155 | .0262 | .0358 | .0449 | .0536 | .0622 | .0705 | .0788 | .0869 | .0949 | 147
148 | .0154 | .0260 | .0356 | .0446 | .0533 | .0618 | .0701 | .0783 | .0863 | .0943 | 148
149 | .0153 | .0259 | .0353 | .0443 | .0529 | .0614 | .0696 | .0777 | .0858 | .0937 | 149
150 | .0152 | .0257 | .0351 | .0440 | .0526 | .0610 | .0692 | .0772 | .0852 | .0931 | 150


TABLE III
(Based on Poisson approximation to the binomial distribution)

Acceptance Number | Values of $a_1 = np_1$ for which | Values of $a_2 = np_2$ for which
        c         |       $P(c, a_1) = .95$          |       $P(c, a_2) = .10$
        0         |             .05129               |             2.303
        1         |             .3554                |             3.890
        2         |             .8177                |             5.322
        3         |            1.366                 |             6.681
        4         |            1.970                 |             7.994
        5         |            2.613                 |             9.275
        6         |            3.285                 |            10.53
        7         |            3.981                 |            11.77
        8         |            4.695                 |            12.99
        9         |            5.425                 |            14.21
       10         |            6.169                 |            15.41
       11         |            6.924                 |            16.60
       12         |            7.690                 |            17.78
       13         |            8.464                 |            18.96
       14         |            9.246                 |            20.13
       15         |           10.04                  |            21.29
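The entries of Tables II and III can be spot-checked numerically. The following is a minimal sketch, not the author's computing method: it assumes that P(c, n, p) denotes the binomial probability of c or fewer defectives in a sample of n, and that P(c, a) denotes the corresponding Poisson probability with expectation a. The function names and the bisection solver are illustrative choices.

```python
import math

def poisson_cdf(c, a):
    """Probability of c or fewer occurrences for a Poisson variable with mean a."""
    term = math.exp(-a)
    total = term
    for k in range(1, c + 1):
        term *= a / k
        total += term
    return total

def binomial_cdf(c, n, p):
    """Probability of c or fewer defectives in n trials with fraction defective p."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))

def solve_decreasing(f, target, lo, hi, iters=200):
    """Bisection for f(x) = target when f is decreasing on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def poisson_a(c, prob):
    """Value a with P(c, a) = prob (the quantity tabled in Table III)."""
    return solve_decreasing(lambda a: poisson_cdf(c, a), prob, 0.0, 100.0)

def binomial_p(c, n, prob):
    """Value p with P(c, n, p) = prob (the quantity tabled in Table II for prob = .10)."""
    return solve_decreasing(lambda p: binomial_cdf(c, n, p), prob, 0.0, 1.0)

for c in range(3):
    print(c, round(poisson_a(c, 0.95), 4), round(poisson_a(c, 0.10), 3))

# Exact binomial value for n = 150, c = 0 next to the Poisson approximation a2/n.
print(round(binomial_p(0, 150, 0.10), 4), round(poisson_a(0, 0.10) / 150, 4))
```

For large n the Poisson value a/n lies close to the exact binomial entry, which is the sense in which Table III extends Tables I and II.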


REFERENCES 
[1] CATHERINE M. THOMPSON, "Tables of percentage points of the incomplete beta function,"
Biometrika, Vol. 32 (1944), pp. 151-181.
[2] MAXINE MERRINGTON AND CATHERINE M. THOMPSON, "Tables of percentage points
of the inverted beta (F) distribution," Biometrika, Vol. 33 (1945), pp. 77-88.
[3] PAUL PEACH AND S. B. LITTAUER, "A note on sampling inspection," Annals of Math.
Stat., Vol. 17 (1946), pp. 81-84.





ON THE RANGE-MIDRANGE TEST AND SOME TESTS WITH BOUNDED 
SIGNIFICANCE LEVELS¹


By John E. Walsh
The RAND Corporation 


1. Summary. This paper is divided into two parts. The significance tests 
investigated in Part I concern the population mean and are based on the quantity 


[(sample midrange) - (hypothetical mean)]/(sample range).


The case in which the observations are a sample from a normal population is 
considered in detail. The tests investigated are summarized in Table 1. These 
tests are found to be very efficient for small samples (see Table 4; power efficiency
is defined in section 3). An investigation of several extremely non-normal
populations using the values of $D_\alpha$ obtained for normality indicates that the
significance level of the range-midrange test is not very sensitive to the requirement
of normality for small samples (see Table 6). Also the tests of Table 1
can be applied without computation through the use of an easily constructed 
graph (see section 4). These properties suggest that the range-midrange test is 
preferable to the Student t-test and the analogue of the Student t-test using the 
sample range (see [1] and [2]) whenever the sample size is sufficiently small. 

Use of the range-midrange test for the case of normality was proposed by E. S.
Pearson in [3], where properties of the test were experimentally investigated 
for the normal and certain non-normal populations. 

In Part II several significance tests for the mean are developed which have a 
specified significance level for the case of a sample from a normal population 
but whose significance level is bounded near the specified value under very 
general conditions, one of which is that the observations are from continuous 
symmetrical populations. Some of these tests are range-midrange tests. Table 
2 contains a summary of the tests and their properties ($x_i$ = $i$th largest observation,
$i = 1, \ldots, n$; conditions (D) are given in section 7).


PART I. THE RANGE-MIDRANGE TEST 


2. Introduction. In 1929 E. S. Pearson proposed using the range-midrange
test for the case of a sample from a normal population (see [3]) and experimentally
investigated some of its properties for sample sizes of 5 and 10 and
significance levels of 2% and 10% (symmetrical tests). Using the constants
(corresponding to the $D_\alpha$ in this paper) determined for the case of normality,

1 This paper was presented to a joint meeting of the Institute of Mathematical Statistics 
and the American Mathematical Society at New Haven, Conn. in September, 1947. The 
results presented in this paper were obtained in the course of research conducted under the 
sponsorship of the Office of Naval Research. This research was performed while the 
author was at Princeton University. 










258 JOHN E. WALSH 


significance level and power function properties of these four tests were experi- 
mentally investigated for several non-normal populations. The results of this 
empirical investigation indicated that the range-midrange test is very efficient 
for normality and not very sensitive to the assumption of normality if the sample 
size is sufficiently small. 

This paper presents an analytical investigation of properties of the range-midrange
test for $n = 2, 3, \ldots, 10$ and a wide range of significance levels.
The results of this investigation confirm the contention that the range-midrange 
test is very efficient for normality and small samples; also an analytical investiga- 
tion of how the significance level changes for the case of certain extremely 
non-normal populations furnishes results which agree with the contention 
that the range-midrange test is not very sensitive to the requirement of normality 
for sufficiently small samples. 

In most cases the results presented in this paper are not directly comparable 
with those obtained by Pearson. It was possible, however, to obtain values of 
$D_\alpha$ ($\alpha$ = 5%, 1%; $n$ = 5, 10) from the results presented in [3]; these values
were found to be in close agreement with the corresponding values of Table 5. 


3. Efficiency of range-midrange. The purpose of this section is to use the 
relations derived in section 6 to determine the power efficiencies of tests A, B 
and C (see Table 1) for $\alpha$ = 1%, 5% and $n = 2, \ldots, 10$. To do this the method
of defining power efficiency given in [4] and [5] will be used. As shown in [5],
it is sufficient to consider only test A; for any fixed $n$ and $\alpha$, tests A, B and C all
have the same power efficiency (note that the significance level of test C is $2\alpha$).

For a normal population (unknown variance) the most powerful test of the 
one-sided alternative $\mu < \mu_0$ is the appropriate Student t-test. The procedure
used in determining the power efficiency of test A consists in first computing the
power function of test A for the given values of $n$ and $\alpha$; then the sample size
of the corresponding Student t-test at this significance level is varied until the
power function of the t-test is approximately equal to that of test A. The sample
size (not necessarily integral) thus obtained for the t-test divided by $n$ is
called the power efficiency of test A for the given values of $n$ and $\alpha$. Intuitively
the power efficiency of a test measures the percentage of the total available 
information per observation which is being utilized by that test. 

Table 3 contains values of the power function for test A. These values were 
computed from equation (3) of section 6 by approximate integration. 

The corresponding values of the power function for the Student t-test were
found by using the normal approximation given in [6]. This approximation 
was used for fractional degrees of freedom. The sample sizes considered as 
well as the resulting power function values are listed in Table 3. A comparison 
of the power function values for the two types of tests furnishes the approximate 
power efficiencies listed in Table 3. 
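The definition above reduces to a simple ratio once the matching t-test sample size is known. The sketch below only reproduces that arithmetic from the sample sizes printed in Table 3; the variable and function names are illustrative, and the t-test sample size for n = 10 at the 5% level is read from the scan as 7.5, which is an assumption.

```python
# (significance level, n for test A, sample size of the t-test with matching power),
# as read from Table 3.
matched_sizes = [
    (0.05, 6, 5.4),
    (0.05, 10, 7.5),  # assumed reading of a garbled entry
    (0.01, 6, 5.88),
    (0.01, 8, 7.2),
    (0.01, 10, 8.0),
]

def power_efficiency(t_sample_size, n):
    """Power efficiency of test A: matching t-test sample size divided by n."""
    return t_sample_size / n

efficiencies = {
    (alpha, n): power_efficiency(t_size, n) for alpha, n, t_size in matched_sizes
}

for (alpha, n), eff in sorted(efficiencies.items()):
    print(f"alpha={alpha:.2f}  n={n:2d}  efficiency={eff:.0%}")
```

These ratios agree with the efficiencies listed for n = 6, 8, 10 in Table 4.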

For n = 2, test A is itself a Student t-test. The power efficiency is therefore
100% for that sample size. This combined with Table 3 furnishes power 


RANGE-MIDRANGE TEST 259 


efficiencies at the 1% level for n = 2, 6, 8, 10 and at the 5% level for n = 2, 6, 10.
The approximate power efficiencies given in Table 4 for other values of n were
obtained from these values by graphical interpolation.

Table 4 shows that the power efficiency for $\alpha$ = 1% is very good for $n \le 8$,
while for $\alpha$ = 5% the efficiency is good for $n \le 6$.


TABLE 1
Summary of range-midrange tests

Definitions

Test based on sample of size $n$ ($2 \le n \le 10$) from an arbitrary normal population.
$x_1$ = smallest sample value.
$x_n$ = greatest sample value.
$\mu$ = the mean of the normal population.
$\mu_0$ = given hypothetical mean value to be tested.
$D$ = [(sample midrange) - (hypothetical mean)]/(sample range)
    = $[(x_n + x_1)/2 - \mu_0]/(x_n - x_1)$.
$D_\alpha$ = constant depending on $n$ and $\alpha$. Values of $\alpha$ versus $D_\alpha$ for
$2 \le n \le 10$ and $\alpha$ = 5%, 2.5%, 1%, 0.5% are given in Table 5.

Test | Accept          | If               | Significance Level
(A)  | $\mu < \mu_0$   | $D < -D_\alpha$  | $\alpha$
(B)  | $\mu > \mu_0$   | $D > D_\alpha$   | $\alpha$
(C)  | $\mu \ne \mu_0$ | $|D| > D_\alpha$ | $2\alpha$
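The three tests of Table 1 can be sketched in a few lines. The function names and the sample values below are hypothetical; the constant 0.425 is the Table 5 entry for n = 5 at the 5% level.

```python
def range_midrange_statistic(sample, mu0):
    """D = [(sample midrange) - (hypothetical mean)] / (sample range)."""
    x1, xn = min(sample), max(sample)
    if xn == x1:
        raise ValueError("sample range is zero; D is undefined")
    return ((xn + x1) / 2 - mu0) / (xn - x1)

def range_midrange_tests(sample, mu0, d_alpha):
    """Decisions of tests A, B and C for a given constant D_alpha."""
    d = range_midrange_statistic(sample, mu0)
    return {
        "A: accept mu < mu0": d < -d_alpha,
        "B: accept mu > mu0": d > d_alpha,
        "C: accept mu != mu0": abs(d) > d_alpha,
    }

# Hypothetical sample of size 5 tested against mu0 = 12 at alpha = 5%.
sample = [9.1, 10.4, 9.8, 10.9, 9.5]
d_value = range_midrange_statistic(sample, mu0=12.0)
decisions = range_midrange_tests(sample, mu0=12.0, d_alpha=0.425)
print(round(d_value, 3), decisions)
```

Only the smallest and largest observations enter the statistic, which is why the tests need no computation beyond a midrange and a range.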


4. Construction of graph. In most problems to which a test of the type 
developed in this paper would be applied, the values of the sample can be con- 
sidered to have practical lower and upper limits, say a and b. For example, in 
many situations zero is a lower limit for the sample values. From a practical 
viewpoint these limits on the sample values do not contradict the assumption 
that the population is normal, since the area under that part of the normal 
distribution which lies outside the interval (a, b) can be considered negligible. 
Thus, since $\Pr(u/v \le w) = \Pr(u \le vw)$, test A can be restated in the form

Accept $\mu < \mu_0$ if the sample point $(x_1, x_n)$ falls in the region (A) of the $x_1$, $x_n$





TABLE 2
Some one-sided and symmetrical tests with bounded significance levels

[Table printed sideways in the original; its entries are not recoverable from this copy.]


TABLE 3
Power function values for test A

Type | Sample | Approx.    | Significance |          Approximate Values of Power Function
Test | Size   | Efficiency | Level        | $\delta = 1/2$ | $\delta = 1$ | $\delta = 1\frac{1}{2}$ | $\delta = 2$ | $\delta = 2\frac{1}{2}$
t    | 5.4    |            | .05          | .244 | .607 | .886 | .969 |
A    | 6      | .90        | .05          | .259 | .599 | .868 | .967 |
t    | 7.5    |            | .05          | .333 | .783 | .971 |      |
A    | 10     | .75        | .05          | .351 | .779 | .962 |      |
t    | 5.88   |            | .01          | .071 | .248 | .551 | .820 | .957
A    | 6      | .98        | .01          | .077 | .271 | .568 | .809 | .935
t    | 7.2    |            | .01          | .091 | .371 | .749 | .949 |
A    | 8      | .90        | .01          | .104 | .389 | .728 | .923 |
t    | 8      |            | .01          | .108 | .453 | .832 | .976 |
A    | 10     | .80        | .01          | .124 | .462 | .814 | .963 |
TABLE 4
Power efficiencies of tests A, B and C for $\alpha$ = 5%, 1% and $2 \le n \le 10$

         |                                   n
$\alpha$ |  2   |  3    |  4    |  5    |  6  |  7    |  8    |  9    | 10
.01      | 100% | 99.7% | 99.4% | 99%   | 98% | 95%   | 90%   | 85%   | 80%
.05      | 100% | 98.5% | 96%   | 93.5% | 90% | 86.5% | 82.5% | 78.5% | 75%
TABLE 5
Approximate values of $D_\alpha$ for $\alpha$ = 5%, 2.5%, 1%, 0.5% and $2 \le n \le 10$

         |                                   n
$\alpha$ |   2   |   3   |   4   |  5   |  6   |  7   |  8    |  9   | 10
0.5%     | 31.83 | 3.02* | 1.37* | .85* | .66  | .55* | .475  | .425 | .39*
1%       | 15.91 | 2.11* | 1.04* | .71  | .56* | .475 | .4?*  | .38  | .35*
2.5%     |  6.35 | 1.30  |  .74  | .52  | .43  | .375 | .33   | .30  | .275
5%       |  3.16 |  .90* |  .55* | .425 | .35* | .30  | .265  | .24  | .22*

* These values of $D_\alpha$ were verified directly by substitution and integration.
The remaining values of $D_\alpha$ for $3 \le n \le 10$ were obtained from these and other
values of $D_\alpha$ ($\alpha \ne$ .005, .01, .025, .05) by graphical interpolation.
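The n = 2 column of Table 5 can be checked in closed form. Section 3 notes that for n = 2 test A is itself a Student t-test; the sketch below further assumes (a step not spelled out in the text) that for two observations t = 2D, so that $D_\alpha$ is half the upper $\alpha$ point of Student's t with one degree of freedom, which is the Cauchy quantile $\tan(\pi(1/2 - \alpha))$. The function name is illustrative.

```python
import math

def d_alpha_n2(alpha):
    """D_alpha for n = 2, assuming t = 2D with one degree of freedom.

    The upper-alpha point of Student's t with 1 d.f. is tan(pi * (1/2 - alpha)).
    """
    return math.tan(math.pi * (0.5 - alpha)) / 2

# Entries of the n = 2 column as printed in Table 5.
printed = {0.005: 31.83, 0.01: 15.91, 0.025: 6.35, 0.05: 3.16}

for alpha, table_value in printed.items():
    print(alpha, round(d_alpha_n2(alpha), 3), table_value)
```

The computed values agree with the printed column to the two decimals given.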





plane defined by

$(1/2 + D_\alpha)x_n + (1/2 - D_\alpha)x_1 < \mu_0, \quad a \le x_1 \le x_n \le b.$


TABLE 6 
Effect of non-normality on the significance level of the range-midrange test 





Probability Density Significance Level 









































































































































= Function mm A es 0 ee s 
- 7 - _Nommal -05 -025 = —. Tel —_ -01 | -005 [20 | 05 | 02 | 01 
3 | 1if0<zr<l .064|.039 |.018 |.010 |.064].039].018 |-010 128] .078) .036 020 
4 | 0 otherwise 058.033 017 | 0096 053} .033|.017 0006 106 .066|.034 |.0192 
5 | Mean = } 043.029 '.015 |.0094|.043 .029| 015 |.0094! .086).058|.030 |.0188 
3 ro wmeintie _|-086).017 0063 0031 .036|.017|.0063 0031 072.084) 0126| 0062 
4 ! |.043| 016 |.0055 0024] .043| .016| .0055 0024 .086 .032/ .0101| .0048 
——| Mean = 0 __~———— | | ||| |__| 
5 | .095  .026 | .0059) .0027| .095) .026| .0059) .0027) .190 052) 0118 0054 
3 | 822 if —1>r>1 |.119.104 |.073 050 |-119| 1041 .073 050 |.238 208) .146 |. 100 
4 0 otherwise 062.061 |.055. 045 062.061) .055 045 124) 122} 110 |.090 
5 | Mean = 0 -031|.031 |.031 | 029 031 031] .031 | 029 |.062 062] .062 |.058 
aes cecllcatacat ma t ncaele nae ceais —_ a piaiembalanabais Cae ieee | 
3 | e*if0<r<x 014] .0067| .0025| .00121.158'.108|.059 | .035 172) .115| .062 036 
"i eetaanein 013] .0048) .0016| .0007 144) 104.065 1 157] 109] .067 /-043 
8 | Mean = 1 017 0055 .0013| 0006] 1221 .096) .061 045 139) .102| .062 | 046 
3 | ari O<r<l | 035|.019 0075} .0038] .096| .061] .030 17 |.131 .080| .038 | .021 
Pe mn — ct ec cael 4 poe adden 
4 | 0 otherwise |.031/ .016 |.0065| .0031| .083] .055| .031 |.018 114} .071) .038 | .021 
5 ton? (028.015 0057] .0031 068) .050) .028 | o19 096.065) .032 020 
3 | az if 0<r<i 027) .014 0053) 0026 112.072) .037 021 130.086) .042 ont 
4 |e eeenion 024.011 00431 .0019 099] .067| 039 | .024 133 .078) 013 026 
5 | Mean = 2 | 028|.012 0037] .o019| .082 061) .036 1025 105 .0731.040 "027 





Likewise test B can be restated as 
Accept $\mu > \mu_0$ if $(x_1, x_n)$ falls in the region (B) defined by


$(1/2 - D_\alpha)x_n + (1/2 + D_\alpha)x_1 > \mu_0, \quad a \le x_1 \le x_n \le b.$





Test C now becomes

Accept $\mu \ne \mu_0$ if $(x_1, x_n)$ falls in either of the regions (A) or (B).

Figure 1 (i) contains a schematic diagram of the regions (A) and (B). Test A
can be applied by constructing a graph of the region (A) and giving the instructions
to accept $\mu < \mu_0$ if $(x_1, x_n)$ falls in (A). Similarly for test B and region (B).
Test C is applied by constructing a graph of both (A) and (B) and accepting
$\mu \ne \mu_0$ if $(x_1, x_n)$ falls in either (A) or (B).

Frequently it is desirable to simultaneously consider more than one significance 
level. This can be accomplished in the manner indicated by Figure 1 (ii). 
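The graphical form of the tests amounts to checking a linear inequality in the smallest and largest observations. The sketch below is illustrative only: the function names are hypothetical, the practical limits a and b are left out, and the constants are the n = 5 entries of Table 5. The boundary function gives the straight line that would be drawn for each significance level.

```python
def in_region_a(x1, xn, mu0, d_alpha):
    """Region (A): (1/2 + D_alpha) x_n + (1/2 - D_alpha) x_1 < mu0."""
    return (0.5 + d_alpha) * xn + (0.5 - d_alpha) * x1 < mu0

def in_region_b(x1, xn, mu0, d_alpha):
    """Region (B): (1/2 - D_alpha) x_n + (1/2 + D_alpha) x_1 > mu0."""
    return (0.5 - d_alpha) * xn + (0.5 + d_alpha) * x1 > mu0

def boundary_a(x1, mu0, d_alpha):
    """Value of x_n on the boundary line of region (A) for a given x_1."""
    return (mu0 - (0.5 - d_alpha) * x1) / (0.5 + d_alpha)

def ratio_statistic(x1, xn, mu0):
    """The statistic D of Table 1, used here to cross-check the linear form."""
    return ((xn + x1) / 2 - mu0) / (xn - x1)

# D_alpha for n = 5 from Table 5, keyed by one-sided significance level.
d_by_level = {0.05: 0.425, 0.025: 0.52, 0.01: 0.71, 0.005: 0.85}
mu0 = 10.0

# One boundary line per level, as in a graph for several significance levels.
for level, d_alpha in d_by_level.items():
    print(level, [round(boundary_a(x1, mu0, d_alpha), 3) for x1 in (6.0, 8.0, 10.0)])

# The linear inequality and the ratio form give the same decision.
for x1, xn in [(8.0, 10.0), (8.0, 10.5), (10.5, 13.0)]:
    d = ratio_statistic(x1, xn, mu0)
    assert in_region_a(x1, xn, mu0, 0.425) == (d < -0.425)
    assert in_region_b(x1, xn, mu0, 0.425) == (d > 0.425)
```

Every boundary line passes through the point where both coordinates equal the hypothetical mean, so the lines for different levels fan out from a common vertex.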


FIG. 1. SCHEMATIC DIAGRAMS OF REGIONS USED IN CONSTRUCTION OF GRAPHS


5. Effect of non-normality on significance level. It has been shown that the
range-midrange test compares very favorably with the Student t-test for sufficiently
small samples and normality. In practice, however, it may happen that
normality is assumed for cases in which the population is not even approximately
normal. Although this represents an error in judgment on the part of the
person applying the test, such situations will undoubtedly occur if the range-midrange
test is used very frequently. The purpose of this section is to
investigate the effect of non-normality on the significance level of the range-midrange
test when the values of $D_\alpha$ based on normality are used. The corresponding
effect of these non-normal populations on the significance level of
the t-test was not considered because of computational difficulties; however the
effect of some other non-normal populations on the significance level of the
t-test was experimentally investigated by Pearson in [3]. The results of this
empirical investigation and of later investigations show that the significance
level of the t-test is not very sensitive to the requirement of normality for small
samples.


Six populations were chosen for investigation. Three of these populations are
symmetrical while the remaining three are strongly asymmetrical. These
particular populations were considered because their probability density
functions have a wide variety of different shapes; also because the
significance level of the range-midrange test can be computed in closed form
for these populations.

The populations investigated are defined by their probability density functions.
Table 6 contains a list of the probability density functions considered along
with the resulting significance levels for the range-midrange test. The cases
investigated are n = 3, 4, 5 and $\alpha$ = 5%, 2.5%, 1%, 0.5%. Larger values of
n were not used because of computational difficulties. The situation of n = 2
was not considered because the t-test and the range-midrange test are identical
for this case. The significance levels of Table 6 were computed by making
direct application of (1) and (2) of section 6.
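
The kind of comparison made in this section can also be approximated by
simulation. The following is only a rough sketch and not the closed-form
computation used for Table 6: it assumes that test A rejects when
midrange + D * range < mu0, it estimates D for a given level by Monte Carlo
under normality as a stand-in for Table 5, and all function names are
hypothetical.

```python
import random


def midrange_range_stat(sample, mu0):
    """Return (midrange - mu0) / range for one sample."""
    lo, hi = min(sample), max(sample)
    return (0.5 * (lo + hi) - mu0) / (hi - lo)


def estimate_d_alpha(n, alpha, trials, rng):
    """Monte Carlo stand-in for Table 5: D_alpha under normality.

    Test A accepts mu < mu0 when midrange + D * range < mu0, that is when
    the statistic above falls below -D, so D_alpha is minus the
    alpha-quantile of the statistic under the normal null hypothesis.
    """
    stats = sorted(
        midrange_range_stat([rng.gauss(0.0, 1.0) for _ in range(n)], 0.0)
        for _ in range(trials)
    )
    return -stats[int(alpha * trials)]


def level_test_a(draw, mean, n, d_alpha, trials, rng):
    """Estimate Pr{test A rejects} when the population mean equals mu0."""
    hits = 0
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        if midrange_range_stat(sample, mean) < -d_alpha:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    rng = random.Random(0)
    d = estimate_d_alpha(3, 0.05, 200_000, rng)
    normal = level_test_a(lambda r: r.gauss(0.0, 1.0), 0.0, 3, d, 200_000, rng)
    # Exponential population with density exp(-x), x > 0, mean 1.
    expo = level_test_a(lambda r: r.expovariate(1.0), 1.0, 3, d, 200_000, rng)
    print(f"D ~ {d:.3f}  normal level ~ {normal:.4f}  exponential level ~ {expo:.4f}")
```

Because the statistic is a ratio with heavy tails, the simulated quantile is
noisy for small significance levels; exact evaluation through (1) and (2)
avoids this.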


6. Significance level and power function derivations. The purpose of this 
section is to present derivations of the significance level and power function 
expressions which were used in the preceding sections. First a general probabil- 
ity expression will be evaluated. Direct applications of the results obtained 
for this expression yield the required significance level and power function 
relations. 

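Let $x_1$ and $x_n$ be the smallest and largest values, respectively, of a
sample of size n drawn from a population with probability density function
f(x). The non-zero probability range of this population is
$\gamma < x < \delta$. Also let three constants $c_1, c_n, c_0$,
$(c_1 + c_n = 1)$, be given and consider the value of

$$\Pr(c_1x_1 + c_nx_n < c_0), \qquad \text{where } M(z) = \int_\gamma^z f(y)\,dy.$$

Using direct methods it is found that the value of this expression is given by

$$
\begin{cases}
[M(c_0)]^n & \text{if } c_1 = 0, \\[4pt]
0 & \text{if } 0 < c_1 < 1,\ c_0 \le \gamma, \\[4pt]
\left[M\!\left(\dfrac{c_0 - c_1\gamma}{c_n}\right)\right]^n
 - n\displaystyle\int_{c_0}^{(c_0 - c_1\gamma)/c_n}
 \left[M(V) - M\!\left(\dfrac{c_0 - c_nV}{c_1}\right)\right]^{n-1} f(V)\,dV
 & \text{if } 0 < c_1 < 1,\ c_0 > \gamma, \\[4pt]
1 - [1 - M(c_0)]^n & \text{if } c_1 = 1, \\[4pt]
0 & \text{if } c_1 > 1,\ c_0 \le \min[\gamma,\ c_1\gamma + c_n\delta], \\[4pt]
1 - n\displaystyle\int_{(c_0 - c_1\gamma)/c_n}^{\delta}
 \left[M(V) - M\!\left(\dfrac{c_0 - c_nV}{c_1}\right)\right]^{n-1} f(V)\,dV
 - \left[M\!\left(\dfrac{c_0 - c_1\gamma}{c_n}\right)\right]^n
 & \text{if } c_1 > 1,\ c_1\gamma + c_n\delta < c_0 \le \gamma, \\[4pt]
1 - n\displaystyle\int_{c_0}^{\delta}
 \left[M(V) - M\!\left(\dfrac{c_0 - c_nV}{c_1}\right)\right]^{n-1} f(V)\,dV
 & \text{if } c_1 > 1,\ c_0 > \gamma.
\end{cases}
\tag{1}
$$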




The value of $\Pr(c_1x_1 + c_nx_n < c_0)$ for $c_1 < 0$ can be obtained from
the above results for $c_1 > 1$. It is easily shown that

$$\Pr(c_1x_1 + c_nx_n < c_0) = 1 - \Pr(c_1'y_1 + c_n'y_n < c_0'), \tag{2}$$

where

$$c_1' = c_n, \qquad c_n' = c_1, \qquad c_0' = -c_0,$$

and $y_1$, $y_n$ are the smallest and largest values, respectively, of a sample
of size n drawn from a population with probability density function
$g(y) = f(-y)$. Thus if $c_1 < 0$, $c_1' = c_n > 1$ and obvious modifications
of the results for $c_1 > 1$ will furnish the value of
$\Pr(c_1'y_1 + c_n'y_n < c_0')$.

The above general results were used in section 5 to investigate the effect of 
non-normality on the significance level of the range-midrange test. 

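Now consider the case in which the n sample values are drawn from a normal
population with mean $\mu$ and variance $\sigma^2$. Then, for test A,

$$
\begin{aligned}
\text{Power Function} &= \Pr\{(1/2 - D_\alpha)x_1 + (1/2 + D_\alpha)x_n < \mu_0\} \\
&= \Pr\{(1/2 - D_\alpha)z_1 + (1/2 + D_\alpha)z_n < \delta\},
\end{aligned}
$$

where

$$z_1 = (x_1 - \mu)/\sigma, \qquad z_n = (x_n - \mu)/\sigma, \qquad \delta = (\mu_0 - \mu)/\sigma.$$

Using the above results with

$$f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}, \qquad M(z) = N(z) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{z} e^{-x^2/2}\,dx,$$

it is found that the power function for test A is

$$
\begin{cases}
1 - n\displaystyle\int_{\delta}^{\infty}
 \left[N(V) - N\!\left(\dfrac{\delta - (1/2 + D_\alpha)V}{1/2 - D_\alpha}\right)\right]^{n-1} f(V)\,dV
 & \text{if } D_\alpha < 1/2; \\[6pt]
[N(\delta)]^n & \text{if } D_\alpha = 1/2; \\[6pt]
n\displaystyle\int_{-\delta}^{\infty}
 \left[N(V) - N\!\left(-\dfrac{\delta + (1/2 - D_\alpha)V}{1/2 + D_\alpha}\right)\right]^{n-1} f(V)\,dV
 & \text{if } D_\alpha > 1/2.
\end{cases}
\tag{3}
$$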


The value of $D_\alpha$ (for given n) corresponding to a specified significance
level $\alpha$ for test A is obtained by solving the equation

$$\alpha = P_A(0), \tag{4}$$

where $P_A(\delta)$ is the power function for test A. From symmetry and the
fact that test C is a combination of tests A and B, test B has significance
level $\alpha$ and test C significance level $2\alpha$ for this value of
$D_\alpha$.

For n = 2, test A becomes a Student t-test with one degree of freedom if
$D_\alpha$ is replaced by $t_\alpha/2$. The relation $D_\alpha = t_\alpha/2$
gives an easily applied method of computing $D_\alpha$ for this case.

Approximate values of $D_\alpha$ for $\alpha$ = 5%, 2.5%, 1%, 0.5% are
contained in Table 5 for $2 \le n \le 10$. For $3 \le n \le 10$, these values
were obtained from (3) and (4) by approximate integration and interpolation.
For n = 2, the relation between $D_\alpha$ and $t_\alpha$ was used.
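
The procedure described here (integrate (3) numerically, then solve (4)) can be
sketched as follows. This is an illustrative reconstruction and not the
author's computation: it assumes the power function has the form
1 - n * integral when D < 1/2, [N(delta)]^n when D = 1/2, and n * integral
when D > 1/2, as written in the docstring, and the function names, the Simpson
rule and the bisection are choices made for the sketch. For n = 2 the result
can be compared with D = t/2, where t is the upper quantile of Student's t with
one degree of freedom (the Cauchy distribution).

```python
from math import pi, tan
from statistics import NormalDist

STD = NormalDist()  # cdf plays the role of N(z), pdf the role of f(x)


def simpson(func, a, b, steps=4000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3


def power_test_a(delta, d_alpha, n):
    """Power of test A at standardized shift delta (assumed form of (3)).

    D < 1/2: 1 - n * int_{delta}^{inf} [N(V) - N((delta - cn V)/c1)]^(n-1) f(V) dV
    D = 1/2: N(delta)^n
    D > 1/2: n * int_{-delta}^{inf} [N(V) - N((-delta - c1 V)/cn)]^(n-1) f(V) dV
    with c1 = 1/2 - D and cn = 1/2 + D.
    """
    if d_alpha == 0.5:
        return STD.cdf(delta) ** n
    c1, cn = 0.5 - d_alpha, 0.5 + d_alpha
    if d_alpha < 0.5:
        lower = delta
        inner = lambda v: (delta - cn * v) / c1
    else:
        lower = -delta
        inner = lambda v: (-delta - c1 * v) / cn
    upper = max(lower, 0.0) + 9.0  # the normal density is negligible beyond
    integral = simpson(
        lambda v: (STD.cdf(v) - STD.cdf(inner(v))) ** (n - 1) * STD.pdf(v),
        lower, upper,
    )
    return 1.0 - n * integral if d_alpha < 0.5 else n * integral


def solve_d_alpha(alpha, n, tol=1e-9):
    """Solve alpha = P_A(0) for D by bisection; P_A(0) decreases as D grows."""
    lo, hi = 0.0, 0.5
    if alpha < 0.5 ** n:  # level at D = 1/2 is (1/2)^n, so D must exceed 1/2
        lo, hi = 0.5, 1.0
        while power_test_a(0.0, hi, n) > alpha:
            hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if power_test_a(0.0, mid, n) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    t_cauchy = tan(pi * (0.5 - 0.05))  # upper 5% point of t with 1 d.f.
    print(solve_d_alpha(0.05, 2), t_cauchy / 2)
```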


PART II. SOME TESTS WITH BOUNDED SIGNIFICANCE LEVELS

7. Introduction. In this part some significance tests (for the mean) are
derived which are based on the assumption of a sample from a normal population.
These tests have the property that the significance level is bounded near the
value for normality under very general conditions. These conditions are

(D)  (a) The observations used for a test are independent.
     (b) Each observation comes from a continuous symmetrical population
         with mean $\mu$.

It is to be emphasized that no two observations are necessarily drawn from
the same population.

The bounded significance level tests developed are summarized in Table 2.
These tests can be used to supplement the tests presented in [5] for n < 9, where
the tests of [5] do not furnish a very wide variety of suitable significance levels.

8. Outline of derivations. Let us consider the range-midrange test for the
more general situation in which the set of independent observations used are
from arbitrary but fixed populations satisfying conditions (D). Let $D_\alpha$
be redefined so that the resulting test A has significance level $\alpha$. Then
it is easily seen that $D_\alpha$ is a monotone decreasing function of
$\alpha$. Thus the significance level of the modified test A will always be
less than or equal to $(1/2)^n$ if $D_\alpha > 1/2$. The significance level
bounds for the tests n = 4, $\alpha$ = 5%; n = 5, $\alpha$ = 2.5%; n = 6,
$\alpha$ = 1%; n = 7, $\alpha$ = 0.5% of Table 2 were obtained from this
relation and obvious significance level relations among tests A, B and C.

The significance levels (for normality) for the tests n = 5, $\alpha$ = 5%;
n = 6, $\alpha$ = 2.5%; n = 7, $\alpha$ = 1%; n = 8, $\alpha$ = 0.5% were
obtained by approximate integration of the expression derived for
$\Pr\{(1/2 + c)x_n + (1/2 - c)x_{n-1} < \mu\}$, $(0 < c < 1/2)$, for several
values of c and then graphical interpolation (here $\alpha$ is the one-sided
test significance level). The significance level bounds were determined from

$$
(1/2)^n = \Pr(x_n < \mu) \le \Pr\{(1/2 + c)x_n + (1/2 - c)x_{n-1} < \mu\}
\le \Pr\{(1/2)(x_n + x_{n-1}) < \mu\} = (1/2)^{n-1}.
$$

The significance levels for the tests n = 8, $\alpha$ = 1%; n = 9,
$\alpha$ = 0.5% were obtained by considering the relations

$$\Pr\{\max[x_{n-1},\ (x_n + x_{n-i})/2] < \mu\} = (1 + i)(1/2)^n, \qquad (i = 0, 1, 2, 3),$$

and applying linear interpolation to find a value c, $(0 < c < 1/2)$, such that
$\Pr\{\max[x_{n-1},\ 0.5x_n + cx_{n-2} + (1/2 - c)x_{n-1}] < \mu\}$ has the
desired value. The significance level bounds were found from

$$
\Pr\{(1/2)(x_n + x_{n-1}) < \mu\}
\le \Pr\{\max[x_{n-1},\ 0.5x_n + cx_{n-2} + (1/2 - c)x_{n-1}] < \mu\}
\le \Pr\{\max[x_{n-1},\ (1/2)(x_n + x_{n-2})] < \mu\}.
$$

The derivation of the power efficiencies listed in Table 2 will not be
considered here. Detailed derivations can be found in [7].
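
The probability relations quoted in this section can be checked by exhaustive
enumeration. The sketch below rests on an assumption that is not spelled out
in the text: under conditions (D) the signs of the deviations from $\mu$ are
independent fair coin flips, independent of the ordering of the absolute
deviations, so all 2^n sign patterns are equally likely and each event depends
only on the pattern. The function name and the use of the magnitudes 1, ..., n
as stand-ins for the ordered absolute deviations are illustrative choices.

```python
from fractions import Fraction
from itertools import product


def prob_max_rule(n, i):
    """Exact Pr{ max[x_(n-1), (x_(n) + x_(n-i)) / 2] < mu } with mu = 0.

    Observations are built as sign * magnitude with distinct magnitudes
    1..n; only the ordering of the magnitudes matters for the event.
    x_(k) denotes the k-th smallest observation, so x_(n) is the largest.
    """
    hits = 0
    for signs in product((-1, 1), repeat=n):
        xs = sorted(s * m for s, m in zip(signs, range(1, n + 1)))
        if max(xs[-2], (xs[-1] + xs[-1 - i]) / 2) < 0:
            hits += 1
    return Fraction(hits, 2 ** n)


if __name__ == "__main__":
    for i in range(4):
        print(i, prob_max_rule(8, i), Fraction(1 + i, 2 ** 8))
```

With i = 1 the event reduces to (x_n + x_{n-1})/2 < mu, which gives the upper
bound (1/2)^{n-1} used for the second group of tests.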


REFERENCES 


[1] J. F. DALY, "On the use of the sample range in an analogue of Student's t-test,"
Annals of Math. Stat., Vol. 17 (1946), pp. 71-74.

[2] E. LORD, "The use of range in place of standard deviation in t-test," Biometrika,
Vol. 34 (1947), pp. 41-67.

[3] E. S. PEARSON, "The distribution of frequency constants in small samples from
non-normal symmetrical and skew populations," Biometrika, Vol. 21 (1929), pp. 280-286.

[4] JOHN E. WALSH, "On the power function of the sign test for slippage of means,"
Annals of Math. Stat., Vol. 17 (1946), pp. 358-362.

[5] JOHN E. WALSH, "Some significance tests for the median which are valid under very
general conditions," Annals of Math. Stat., Vol. 20 (1949), pp. 64-81.
610-611. Submitted for publication in Annals of Math. Stat. 

[6] N. L. JOHNSON AND B. L. WELCH, "Applications of the non-central t-distribution,"
Biometrika, Vol. 31 (1940), p. 376.

[7] JOHN E. WALSH, "Some significance tests for the median which are valid under very
general conditions," unpublished thesis, Princeton University Library, Princeton,
N. J.








ASYMPTOTIC STUDENTIZATION IN TESTING OF HYPOTHESES
By HERMAN CHERNOFF¹


Cowles Commission for Research in Economics


1. Summary. A method suggested by Wald for finding critical regions of
almost constant size and various modifications are considered. Under reasonable
conditions the sth step of this method gives a critical region of size
$\alpha + R_s(\theta)$ where $\theta$ is the unknown value of the nuisance
parameter, $R_s(\theta) = O(N^{-s/2})$ and N is the sample size. The first step
of this method gives the region which is obtained by assuming that an estimate
$\hat\theta$ of the nuisance parameter is actually equal to $\theta$.


2. Introduction. The problem of nuisance parameters often arises in the
testing of hypotheses in the following form: It is desired to construct a test
of a hypothesis H so that the probability of rejecting H if it is true is equal
to $\alpha$. However the probability distribution of the data is not uniquely
determined by H. Indeed, if the hypothesis is true then the observations have a
distribution depending on a nuisance parameter $\theta$ whose value is not
known. Generally a critical region will have a size which depends on the value
of $\theta$. Neyman has done considerable work on the problem of finding
similar regions, i.e., regions whose size is independent of $\theta$.

Wald has suggested the following method of finding critical regions whose
size is almost independent of $\theta$. Suppose that t is a statistic such that
if $\theta$ were known then the critical region $t < c_1(\theta)$ would be a
good critical region for testing the hypothesis H. Suppose also that
$\hat\theta$ is an estimate of $\theta$ and that $g(t, \hat\theta \mid \theta)$
represents the joint distribution of t, $\hat\theta$ under H when $\theta$ is
the value of the nuisance parameter. Then consider the regions

$$
\begin{aligned}
&t < c_1(\hat\theta) && \text{where } \Pr\{t < c_1(\theta)\} = \alpha \text{ independent of } \theta; \\
&t < c_1(\hat\theta) + c_2(\hat\theta) && \text{where } \Pr\{t - c_1(\hat\theta) < c_2(\theta)\} = \alpha \text{ independent of } \theta; \\
&\qquad\vdots \\
&t < c_1(\hat\theta) + \cdots + c_s(\hat\theta) && \text{where } \Pr\{t - c_1(\hat\theta) - \cdots - c_{s-1}(\hat\theta) < c_s(\theta)\} = \alpha \text{ independent of } \theta.
\end{aligned}
$$

Under the assumption that $\hat\theta$ is close to $\theta$ it is reasonable to
expect that $\Pr\{t < c_1(\hat\theta)\}$ would be close to $\alpha$. It might
also be expected that $\Pr\{t < c_1(\hat\theta) + c_2(\hat\theta)\}$ would be
even closer to $\alpha$.
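
The first step of the method can be illustrated by simulation in a toy model.
The model is an assumption made for this sketch and is not taken from the
paper: t is normal with mean 0 and variance $\theta$, and $\hat\theta$ is
independent of t with (dof) $\hat\theta/\theta$ distributed as chi-square with
dof degrees of freedom. If $\theta$ were known, $c_1(\theta) = z_\alpha
\sqrt{\theta}$ would give size $\alpha$ exactly; substituting $\hat\theta$
gives a size that differs from $\alpha$ by an amount that shrinks as dof grows.
The function name is hypothetical.

```python
import random
from statistics import NormalDist


def first_step_size(alpha, dof, trials, rng):
    """Estimate Pr{t < c_1(theta_hat)} in the assumed toy model, theta = 1.

    c_1(theta) = z_alpha * sqrt(theta), where z_alpha is the lower
    alpha-quantile of the standard normal distribution.
    """
    z_alpha = NormalDist().inv_cdf(alpha)
    hits = 0
    for _ in range(trials):
        t = rng.gauss(0.0, 1.0)
        # chi-square with dof degrees of freedom = Gamma(shape dof/2, scale 2)
        theta_hat = rng.gammavariate(dof / 2.0, 2.0) / dof
        if t < z_alpha * theta_hat ** 0.5:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    rng = random.Random(0)
    for dof in (5, 20, 80):
        print(dof, first_step_size(0.05, dof, 200_000, rng))
```

In this particular model the exact size is a Student t probability, so the
simulation only serves to show how the discrepancy behaves before any
correction term $c_2$ is added.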

This method has been shown to have good properties when considered from
the asymptotic point of view. Suppose that t, $\hat\theta$ are two sequences of
statistics (depending on N, the size of the sample or an analogous variable)
with distribution represented by $g(t, \hat\theta \mid \theta)$ where N is
understood to be present. Then it has been shown that under reasonable
conditions, with modifications for the sake of calculation,

$$|\Pr\{t < c_1(\hat\theta) + \cdots + c_s(\hat\theta)\} - \alpha| = O(N^{-s/2}).$$

The statement of the theorem presenting this result will be given in section 4.

¹ This paper is based on a dissertation written under the supervision of
Professor Abraham Wald and submitted as partial fulfilment of the requirements
for Ph.D. in the Graduate Division of Applied Mathematics of Brown University.


It has also been shown that if, roughly speaking, $\hat\theta$ is distributed
almost symmetrically about $\theta$, the above result may be obtained in half
the steps, i.e.,

$$|\Pr\{t < c_1(\hat\theta) + \cdots + c_s(\hat\theta)\} - \alpha| = O(N^{-s}).$$

It is true that under relatively weak conditions and for fixed N it is possible
for any $\epsilon > 0$ to obtain a function $h(\hat\theta)$ such that
$|\Pr\{t < h(\hat\theta)\} - \alpha| < \epsilon$. However such a critical
region can have very poor properties from the point of view of the alternative
hypotheses, especially if $h(\hat\theta)$ is a very wildly oscillating
function. On the other hand this objection does not apply to Wald's method for
large N because

$$
\begin{aligned}
|c_1^{(r)}(\theta)| &< M, & r &= 0, 1, \cdots, s; \\
|c_2^{(r)}(\theta)| &< MN^{-1/2}, & r &= 0, 1, \cdots, s - 1; \\
&\ \ \vdots \\
|c_s^{(r)}(\theta)| &< MN^{-(s-1)/2}, & r &= 0, 1;
\end{aligned}
$$

and hence $c_1(\hat\theta) + \cdots + c_s(\hat\theta)$ is almost constant over
"that small range in which $\hat\theta$ will probably fall."

In the above it has been implied that $\theta$ is a one dimensional variable.
However the results are easily extended to the case where $\theta$ is a
k-dimensional variable.

The direct application of the method is often quite difficult because of the
calculations involved. Modifications can be applied which simplify the
calculations. Such modifications usually consist of changing the
$c_i(\theta)$ by a small amount provided the remainder is simple and "well
behaved." A case where considerable simplifications can be made is that where
$g_1(t \mid \hat\theta, \theta)$, the conditional distribution of t, can be
expanded in a Taylor expansion,

$$
\begin{aligned}
g_1(t \mid \hat\theta, \theta) ={}& g_1(c_1(\theta) \mid \theta, \theta)
 + (t - c_1(\theta))\,\frac{\partial g_1}{\partial t}
 + (\hat\theta - \theta)\,\frac{\partial g_1}{\partial \hat\theta} + \cdots \\
&+ \sum_{p+q=s} \frac{1}{p!\,q!}\,(t - c_1(\theta))^p(\hat\theta - \theta)^q\,
 \frac{\partial^{p+q}}{\partial t^p\,\partial \hat\theta^q}\, g_1(t' \mid \theta'', \theta),
\end{aligned}
$$

where the partial derivatives "behave." This case will be described in detail
in section 3, and an example previously treated by Welch (see [1]) will be
discussed in section 4.

Another case where simplifications often arise is the asymptotic case, that is
the case where $g(t, \hat\theta \mid \theta)$ has an asymptotic expansion. The
asymptotic case may also be regarded as an extension of the following partition
principle which is very useful. If
$g(t, \hat\theta \mid \theta) = g_0(t, \hat\theta \mid \theta) + h(t, \hat\theta \mid \theta)$
and $\iint |h|\,dt\,d\hat\theta < MN^{-s/2}$ and if $\varphi(\hat\theta)$ is
such that

$$\left|\int_{-\infty}^{\infty} d\hat\theta \int_{-\infty}^{\varphi(\hat\theta)} dt\; g_0(t, \hat\theta \mid \theta) - \alpha\right| < MN^{-s/2},$$

then $|\Pr\{t < \varphi(\hat\theta)\} - \alpha| < MN^{-s/2}$. Thus our theorems
apply to $g(t, \hat\theta \mid \theta)$ if $g = g_0 + h$ where $g_0$ has
sufficient differentiability properties.


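3. The Taylor expansion treatment. Let
$g(t, \hat\theta \mid \theta) = g_1(t \mid \hat\theta, \theta)\,g_2(\hat\theta \mid \theta)$
where $g_1$ is the conditional density of t given $\hat\theta$ and
$g_2(\hat\theta \mid \theta)$ is the marginal density of $\hat\theta$.
$g_3(t \mid \theta) = \int d\hat\theta\, g(t, \hat\theta \mid \theta)$ is the
marginal density of t. In what follows we shall use M as a generic bound. Thus
the statement $f(t, \theta) < M(\theta_1, \theta_2)$,
$\theta_1 \le \theta \le \theta_2$, means that there is a constant M depending
on $(\theta_1, \theta_2)$ and independent of t, $\theta$, N so that
$f(t, \theta) < M$ for $\theta_1 \le \theta \le \theta_2$.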

First we obtain ¢,(@) so that Pr{t < o(6)} = a. 
Then we have 

THEOREM 1. Jf for every finite interval (0; , 62), 


| 
(i) oe o+ a) < Gilt, 0) << Gt), |A| <A’, &,N), p=0,1,---s 


A<00+AS h&, 


where [ G.{t) dt < M(@,, 9), G, and Gz, may depend on N, 6, and 62 


90 Patt zs continuous in t, 6 and 


bounded in absolute value by M(Ci, C2, 1, &) forptq<s,4A 56K bh, 
G<st<G.; 


es << @,) ~< gate, 4) for 45058, <t< C2; 


a C2, A, 
(iv)0 <a <1, 


then Pr{t < c(6)} = a defines c(0) uniquely and so that | cy” (0) | < M(@,, 4) 
forp = 0,1, - -",8A SOSH. 

Proor. Since g3(t| @) is positive, c:(@) is uniquely defined by condition (i). 
From this and conditions (i) and (ii) it follows that cj(@) exists and is given by 


c4(9) 
(1) : dt —— fs (t | | 6) + c1(0) gsc, (8) | 0). 




We may continue in this fashion differentiating formally p < s times to get 


1 Pag a . a al 
@) faa BLO - Srepmenntes Ol «++ fee] al 10 


+ cf” (6) g3(cx (8) | @) _—_ Q, Nod, ere Dis ti, nt Ui a +Z < p. 





From the continuity and positiveness it follows that c”(@) is continuous. Since 


[ G.2(t) dt < M(@,, 6) it follows that there is a constant M(6,, 6) so that 


—M (6; ,92) eo 
| Gi(t) dt < «, [ Oia <1 -~<«. 
= M (0; 82) 


Thus 
| (8) < M(6,, 62). 


From (1) and condition (i) it follows easily that | ¢:(0) | < M(@,, 6). Similarly 
we obtain | ¢”(6) | < M(@, 62) for 0: < 0 < @&. 

While the conditions (i) to (iv) suffice to insure the results of the theorem
they are not necessary. It is often possible to obtain these properties of
$c_1(\theta)$ in particular examples where $g_3(t, \theta)$ does vanish at
points so long as $g_3(c_1(\theta), \theta)$ behaves well.

DEFINITION 1. (6) is an admissible function of order m(m < 8s, s fixed in 


advance) if gm(6) = (6) + -+- + em(6) where Pr{t < c:(0)} = a and 
(3) | c§?(0)|<M(,0)N O°", p=0,1,---,st1l—-tA<c0< &. 
Now let 


(4) H,(0) = N*? E(6 — 6)" = N*? | (6 — 6)*g(6| 0) dé and 
g?t? s 
(5) Gal) = she 5j0 2(E | 8, 8) [emer dae » 


We have 
THEOREM 2. If 
(i) Prft<«(0)} =a, O<a<1, and |{c§”(6)| < M(H, &), 
A< 90S h,p=O0,1,°*:,8; 
(ii) 6 = 6(N) = O(1) is a function of N such that 


[ dé | 6 — 6 \*go(6| 0) < M(0., )N°", 0 < 0 < & hk =0,1, +++, 8; 
16—6| >8 


, grt ‘ 
(iii) | aypage C1 8, 0) | S M(H, &), pta=s, 


\t — c(0)| <p, [6 — | <4, 










where 


p= Max. | a (6) — a(6)|+N O°, 9 >0, &SO< &; 


)@-6|s 
(iv) |H{?(0)|<M(a,%) for p=0,1,-+-,s—kh k=1,++*y8, 
ASO bh; 
(v) | G5(0)|<M(i,6) for 1=0,1,++-,s—p—gq, 
pt+qse-—1; 


(vi) ¢m(6) is an admissible function of order m < s, 
then 


(6) Pr{t < ¢n(68)} =at tm(0)N” eee + 1me(O)N 
where 
| r§?(0) | < M(@, 62) for p=0,1,---,8—4J, $54 6,<0< bh. 


Proor. Expand g,(t| 6, 6) in a Taylor Expansion about t = (8), 6 = 8, 
with remainder terms of order s in t — ¢(0), 6 — 6, and expand c¢;(6) about 
6 = 6 where the remainder term is of orders + 1 — 7. Then for | 6 — 6| < 4, 
we have 


¢m(6) f 
fault} 6, 0) at = PIO — 0)', £00), Gye} + RN, 


1 (8) 


where P is a polynomial and | R | < M(6, , 6)2_ (6 — 6)° ‘N **for|6—6| < 6. 
t=(0 


Integrating over | 6 — 6| < 6, we use conditions (ii), (iv) and (v) and the 
theorem follows. By a similar argument we have 

THEOREM 3. If 
(i) the conditions of Theorem 2 hold for each (6; , 62) so that 


—-x~ lB <A<h< hoo 
and 
(ii) gi(ei(@) | 6, 0) > (1/M(A, 02) > 0, A<d6S h, 
then the sequence 
¢1(6) = c1(8); 
¢2(8) e(6) — ra(6)N~”; 


l'm—1 im~1(0) earn 
gi(ci(6) | 6, 6) 


is a sequence of admissible functions such that 


(8) 


om) = omr(6) — ‘ m<s 


(a) Pr{t < om(6)} = a + R(O)N~™”, m<-s, 




where | R(@) | < M (4, ’ 62) for By < A; < 6 < A. < Be . 

These theorems permit us to obtain and to calculate critical regions whose 
size is asymptotically close to a. 
In Theorem 2, condition (ii) was much stronger than necessary. It may be 


relaxed if we define 


H,(6) = / N*"92(6 


|6—0|<38 





0)(6 — 6)‘ de, 
where 


Pr{}6—6|>6} < M(,,0)N”", 6 = 6(N) = O(1). 


However this may complicate the calculations. 
The symmetric case arises when the first moment almost vanishes, i.e. 


(10) | H{”(0)| < M(,,@)N~”, p=0,1,---,s—1, ASO<h. 
In this case we have instead of the sequence given in Theorem 3, the sequence 
oi(6) = c(6); 

r1,2(6)N* + 11,36)N°? | 

91(cx(6) | 6, 6) 
n—1,2m-00)N~? + rn-1,2m—1(6) NO 

gi(cr(6) | 6, 6) 
which is a sequence of admissible functions such that 
Pr{t < om(6)} = a+ rmam(@)N” + +++ + 1me(O)N” 
| rS?(0)| << M(i,%) &850<% =p=0,1,---,s—n. 


(11) ¢2(6) = c (8) —_ 


m6) a m-1(0) — 


’ 


4. An example. The following example previously treated by Welch from a 
different point of view will furnish an illustration of the applicability of the 
theorems to the case where @ is a k dimensional parameter. It will also serve 
as an example of an extended type of symmetry. That is, it has the property 
that | H$22,(0) | < (6, , @)N~””, and hence, in the sequence (11), the rm,2m+1(0) 
terms effectively vanish thereby simplifying the calculations considerably. 

We suppose that t is a normally distributed variable with mean $\mu$ and
variance $\sigma^2 = \lambda_1\sigma_1^2 + \cdots + \lambda_k\sigma_k^2$ where
the $\lambda_i$ are known positive constants, the $\sigma_i^2$ are unknown
parameters each of which is independently estimated by $s_i^2$ where
$N_is_i^2/\sigma_i^2$ has the $\chi^2$ distribution with $N_i$ degrees of
freedom. It is desired to test the hypothesis that $\mu = 0$ so that the
probability of rejecting the hypothesis if it is true should equal $\alpha$.
Under the hypothesis the joint density distribution of
$t, s_1^2, \cdots, s_k^2$ is given by
: seca TE nh ee 
(12) g(t, Si, °**, & O1,°°*, o%) = Vine? LL asi a3; N3), 


t=1 








where the moments of s; — oj = 6; — 6; are given by the coefficients of ué/k! 
in the expansion about u = 0 of e "(1 — (2Quei/N;))-**"*: 
H,(c;) = 0; 
H2(o3) = 203 ; 
H;(0;) sere 4aiN;"”: 


2 I Nz? 8 
H4(o;) a (¥ — 2N t Jo; . 
We define ¢,(@) by Pr{t < ¢,(@)! = a where 
6 = (91, 02,°**,0%) and 6 = (81, 82,°°* 5%), 


(0) = (10. 


- diate ( . ‘ ae es 
Now a2(@) — a = Prie(@) < ¢ < e(@)} may be computed within terms of order 
N;~ by expanding 


A, ./.~ 25 2 L , / 2 2\ 7.2 2 
c:(6) ww He —- G du 5, hlsi — ;) mem 1 >; goa MAMAS ad a3) (8; = g;) 





ee {1+ (t — ¢is)(—a/o)}, 
V/ 2ro* * Pia ; re , 
whence 
; P — melt C1 2 2 
(A) — os oo [dst ds i(|oi3 Ni) oS Alsi — oF 
a(@) — a I ; 8} 2 TL ove! lo; Dyas =} - 2 S o%) 
Cc 2 2) 2 2 
ay Ly Ms As(Si — o8)(8} — oj) — (2 5 LL MAG(Si — 0) (8; — oh 
—c2/2 
é {at a $ 401 (Sr yr? 
os RE Bs wh. iaoiNs } + O(.)_ N77). 
Vint | 808 ante aes ee es 
Thus 
} 3 
a0) = S559 Do Moe 
and 
as(0) = Prit<as+ °° = 4 > risiNe'} = a + O(N), 
where 


= Zz NS: « 


Further approximations become somewhat complex and should be carried out 
in a systematic fashion. 


5. Remarks. The range of application in practical statistical problems of the
theorems of section 2 may be somewhat more limited than that of the original
method proposed by Wald. Concerning the original method, the following
theorems have been established.
THEOREM 4. If 


(i) Prf{t < «(6)} = a,0 <a <1, where! cf? (0)! < M(O,, ), 0, << 6 < be, 


a off, 6+ A16+ A), 


ii) oe < G6, 0) fori +j7<s—-—1,0C, <i <C,, 

(i) 1 5a < Gb, 0) fori + j < StS, 
6.<0,0+A<6,/A! < A’, where GOO, 6) de pends on Cy , Co. 0; , & . N, 
and is integrable in 6 over (— ©, &); 

ose g°ti f 6! 6) ; s : 

(ii) g( < LG, ),i +9 58-1, <15%,4<0<68 


atid A? 
where [ L(6, 0)|6 — 0|* dd < M(O,, 0, Ci, C:)N*", k = 0,1, 


(iv) O< ACL, Cr, 6,6) <A) < galt | 0) < BY) < BCL, Cr, 1, &)< a, 


ASO0<&, Sts, 


/ B(t) dt < M(6, &): 


(v) g(t, 6| 6) > 0, 
then a sequence cr (6), c2(0), co (8), «++, ce (6), exists where Cm(@) is uniquely defined in 
(0; , A) by Prit — cr (6) — --- —ca_1(6) < em(8)} = a, and 
lcS?(8)| < M(0,,0)N ">? p = 0,1,---,s—m+1,4<6< b 
and cn(0) is any function so that 
lcm’? (0) — c§?(0)| < MN-™ for <0< &,p=0,1,-:-,s—™m, 


and 


(m—1)/2 


.¥ (Pp) 


Um ( 6) | 


lA 


M(6,, 02)N —-~7 <§6< wo p=0,1,---,s—mil- 


Finally for ¢,,*(@) arbitrary within the above conditions, 


| Prit = ci (6) — tee c* (6) <0} —a\ < M(6,, 6.)N°" for A<6< & 


The conditions on the derivatives with respect to $\Delta$ are natural because
the intuitive approach to the method seems to hinge on the assumption that
$g(t, \hat\theta + \Delta \mid \theta + \Delta)$ changes gradually with respect
to $\Delta$ "independent" of the value of N. This would not be true of
$g(t, \hat\theta \mid \theta + \Delta)$ for large N.

The $c_i^*(\theta)$ were introduced in Theorem 4 because in practical examples
it is usually found too difficult to compute $c_i(\theta)$ efficiently. On the
other hand there are many alternative ways of obtaining functions with the
properties of the $c_i^*(\theta)$. The $c_2(\theta)$, $c_3(\theta)$, etc.
mentioned in Theorems 1, 2, 3 play the role of the $c_i^*(\theta)$ in Theorem 4
with the exception of the condition on $c_i^*(\theta)$ for $\theta$ outside
$(\theta_1, \theta_2)$. The exception is due to the fact that Theorems 1, 2, 3
correspond to the "infinite case." Theorem 4 is applicable to those cases where
one is willing to assume that $\theta$ lies in $(\theta_1, \theta_2)$. It often
happens that there is no such reason or that the conditions of the theorem hold
only for every closed proper subinterval of $(\beta_1, \beta_2)$ but not for
$\beta_1 < \theta < \beta_2$ itself. In these cases we may apply

THEOREM 5. If 

(i) all of the conditions of Theorem 4 apply to every finite proper closed subinterval 
(0: , 42) of (81, B2) where (8; , 82) may be an infinite interval; 

Gi) Pr{j6— 6! > 6(N)} < M(6,, &)N°” for B, < 0 < 0 < O < Bo, where 
6(N) = O(1) unless B, or Bo is finite, in which case 6(N) = o(1), then a 


sequence cr (6), ¢2(0), c2 (6), --- , ce (6), exists, where cn(6) is uniquely defined in 
(8), Be) by Pr{t — e*(6) — c2*(6) — --+ — cnr1(6) < en(0)} = a, so that for every 
(0; ’ 62), 


| em” (8) | < MO, )N "if B1 < A SOS < Be, 
p=0,1,---,s—m+l 
and for c.(6) arbitrary within the above conditions 
| Prit < er (6) + +++ + en(6)} — a| < M(H, 62)N-™” 
af BLAS OS Oe << Boysm<s. 


Essentially this theorem can be proved by reference to the proof of Theorem 4
applied to the function

$$
g^*(t, \hat\theta \mid \theta) =
\begin{cases}
g(t, \hat\theta \mid \theta) & \text{for } |\hat\theta - \theta| < \delta; \\
0 & \text{for } |\hat\theta - \theta| \ge \delta.
\end{cases}
$$

Some of the conditions in Theorems 4 and 5 are stronger than necessary. For
example g > 0 may be replaced by a weaker condition where g is positive in a
region about $t = c_1(\theta)$. On the other hand the condition
$\Pr\{|\hat\theta - \theta| > \delta\} < MN^{-s/2}$ in Theorem 5 is necessary
to the argument used in the proof. It is easy to construct trivial examples
where the results of this theorem apply although this condition is not
satisfied. However an example has also been constructed where all the
conditions of Theorem 5 hold except for this condition and the method of Wald
fails to give the results.

These theorems are very easily extended to the k-dimensional parameter case
by replacing the conditions on the derivatives with respect to $\Delta$ by the
same order mixed derivatives with respect to
$\Delta_1, \Delta_2, \cdots, \Delta_k$ of

$$g(t, \hat\theta_1 + \Delta_1, \hat\theta_2 + \Delta_2, \cdots, \hat\theta_k + \Delta_k \mid \theta_1 + \Delta_1, \cdots, \theta_k + \Delta_k).$$

The symmetric case arises when the distribution of $\hat\theta$ is almost
symmetric about $\theta$. More exactly we have



THEOREM 6. If 
(i) All the conditions of Theorem 4 hold and L(6, 0) has the additional property that 


[ (6 — 6)°L(6, 0) db < M(4,6)N7, %<0<%, 


and 
dg**? ae ag'? a | a a 
| 29 __ 66) — “2 _ (4, 26 — 61 6)| < L(6,6)|6 — al, 
(ii) | OA‘ Ot (t, 0 | @) OA‘ ot? ( 66) | < L(6, 6) |6 — @| 
(sts G., $< 060< he, t+j<cs-—1, 
then it is possible to construct a sequence cz (6), c2(6), --- ,c7 (8), as in Theorem 4 
so that 


| o§?(0) | < M(@., 62)N~”"™, 
p=0,1,---,s—-2m+2,4<60< h; 
ck (9) — c6?(0) | < M(&, )N~”, 
p=0,1,---,s—-2m+1,4<0<6; 
| ck (6) | < M(G,, &)N~", 
p=0,1,---,s—-2m+2,-0 <6< c; 
and 


| Prit< cr (6) fees + cn (4) } —a|< M(h, 6.)N~*”?, 
A<6< &,r= Eg 


Theorem 5 can also be extended to the symmetric case. 

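It is often possible in the theory of statistics to obtain an asymptotic
expansion of the distribution of t, $\hat\theta$. The treatment of such cases
is often very simple because of the prominent role played by the normal
distribution in such asymptotic expansions. Suppose that

$$g(t, \hat\theta \mid \theta) = \sqrt{N}\,\nu(t, y \mid \theta),$$

where $y = \sqrt{N}(\hat\theta - \theta)$; $\nu$ = density distribution of
(t, y);

$$
\nu(t, y \mid \theta) = \nu_0(t, y \mid \theta) + N^{-1/2}\nu_1(t, y \mid \theta) + \cdots
 + N^{-(s-1)/2}\nu_{s-1}(t, y \mid \theta) + \rho(t, y \mid \theta)N^{-s/2},
$$

$\nu_0, \nu_1, \cdots, \nu_{s-1}$ are independent of N;

$$\iint |\nu_i|\,dy\,dt < M(\theta_1, \theta_2), \qquad \theta_1 \le \theta \le \theta_2;$$

$$\iint |\rho|\,dy\,dt \le M(\theta_1, \theta_2), \qquad \theta_1 \le \theta \le \theta_2.$$

Correspondingly we have

$$
g(t, \hat\theta \mid \theta) = g_0(t, \hat\theta \mid \theta) + N^{-1/2}g_1(t, \hat\theta \mid \theta) + \cdots
 + N^{-(s-1)/2}g_{s-1}(t, \hat\theta \mid \theta) + r(t, \hat\theta \mid \theta)N^{-s/2},
$$

where

$$g_i(t, \hat\theta \mid \theta) = \sqrt{N}\,\nu_i(t, y \mid \theta), \qquad r(t, \hat\theta \mid \theta) = \sqrt{N}\,\rho(t, y \mid \theta).$$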


> 


cs c, (0) 
Then if we define ¢;(@) by | dé [ dtgo = a 


co 
ne Cy (O)+-++++Om — 1 (0) +0m (0) 
Cm(8) by | dé : dtgo 
J—co C1 (O)+---+em _ 1 (8) 
ae 204 (8)-++++-bem— 1 (8) 
=a- do | dtlgo + aN? + e+e + gusN 1). 
vx 06 


or bv 
eo 


em(0) [ dOgo(ex(8), 8 6) 
0 


= €1(6)++0e+0m —1(6) )/2) 
7A } y-lj2 (m—1) /2 
=e | dé [ dt[go + gN + eee + gmiN °" ''"), 
J— 2 J—x 
we obtain

$$|\Pr\{t < c_1(\hat\theta) + \cdots + c_s(\hat\theta)\} - \alpha| < M(\theta_1, \theta_2)N^{-s/2},$$

if g obeys the conditions of Theorem 4 except that we need only s - i + 1
derivatives for $g_i(t, \hat\theta \mid \theta)$. The above definitions of
$c_m(\theta)$ correspond to the $c_m^*(\theta)$ in Theorem 4. Analogues of
Theorems 5 and 6 also apply to the asymptotic case.


REFERENCE 


[1] B. L. WELCH, "On the studentization of several variances," Annals of Math. Stat.,
Vol. 18 (1947), p. 118.


SOME LOW MOMENTS OF ORDER STATISTICS 
By H. J. GODWIN


University College of Swansea, Wales 


1. Introduction. In a paper on order statistics from several populations 
[1], there were given, among other results, the means, variances, covariances, and 
correlations of order statistics in samples of ten or less from a normal population. 
These were obtained by numerical integration, and on account of the difficulties 
arising therefrom, some results were given to only two decimal places. More 
recently, Jones [3] has shown that some of the integrals, for sample sizes not 
greater than four, can be evaluated explicitly. 

In this note these results are supplemented in two ways. For a paper which 
the author has recently submitted to Biometrika integrals were evaluated which 
can be used to give some of the results in [1] to more places of decimals. It is 
also shown that the table of explicit values can be extended. 


2. Approximate values. Let the population studied be normal with mean 
zero and variance unity, and let the members of a sample of n be x(1 | n) > 


a(2|n) > --- >a(r}n). The integrals available are 
aa ° - 
a [ F'(x)( — F(z))' del <i <3), and 
a) 


vii = [ Pe [ a-FY) drat <i,j;6 +4 < 10), 


where 





’ : z z 1 442 
F(z) = [10 ae = t 


These were evaluated to ten places of decimals, the last place possibly being in 
error by one or two units. 
For the purpose in hand we define also 


a 


ai, j)= | «Fw — F(a))i dx = —a(j, i), 
and 
0,7) = | xy@)Fi@) — F(a) dr = 38(j, 0. 


Now, on integrating by parts, we have 


| @= FW) dy = -20 — F@) + | vs) ay, 


97 










and for f(x) as defined above (so that in what follows we restrict ourselves to the 
normal distribution only), the second integral is f(x). Hence ¥(z, 1) + a(i, 1) = 
1/(¢ + 1) and we can construct a table of a’s by using also the relation 


a(t,j) — a@ + 1,9) = a(t,j + 1). 


Again, on integrating by parts, we have 





2 itl ; ; 
oe ee [ ; : 7 (ix? f(x)(1 — F(x))*? — 2e(1 — F(z))*} dxlé > 0) 





i ; . so 2 , 
* {4a — 1,4 — 1) — BG, dD} - 1 a(t + 1,2), 
using the fact that, in this particular case, 2F — 1 is an odd function and 
F(1i — F) an even function of zx. 
Hence 6(i, 7) = a+] Bi-—1li-—1)- i a(t + 1, 7), and using 


B(i, 7) — B@ + 1,7) = BC, 7 + 1) we can find the @’s. 


Finally we put y(?, 7) = whet at, J) = ¥(2, J) 





which can be shown by 


7) 
an integration to be equal in this case to y(j, 2). 
Now 
(1) E(x(i|n) — xi +1]n)) = rc. [ F"“*(x)(1 — F(x))* dz, 


as was proved by Irwin [2]. By the symmetry here this integral is the same 
if i, n — ¢ are interchanged, and since F*(1 — F)’ + F’(1 — F)* isa polynomial 
in F(1 — F) (as may be seen by putting F = 4} + G) the integrals (1) can be 
expressed in terms of the ¥(z). Using the fact that the expected value of the 
median is zero the E(x(z | n)) follow. 


The frequency function of x = x(i| n) is 
n! uit eee 
G—Din—»! f@a — F(a))" F’ (x), 
and so ‘ 
(2) E(x(i|n)) =i "C; BG — 1,2 — 2). 


The joint frequency function of x; = x(i|n) and x; = 2x(j|n) is 


n! r _— B(r.)\i-} a es \)\s-=1 pei... | 
Go DiGrin Dina pi rsa F(x;))" (F(a) — F(z;)) FF"? (x;) 


(taking j > 72), and to find E(x; x;) we multiply by 2; x; and integrate, z; going 




from —% to ©, and a;from2z;to ©. Onexpanding (1—F(x;)—(1—F(x,)))7 
by the multinomial theorem a typical term is 


(3) [ | tex fedfa)(l — F@)) PY (es dx; da. 


TABLE 1

Means and standard deviations

Statistic |   Mean    | Standard Deviation | Statistic |   Mean    | Standard Deviation
x(1|2)    |  .5641896 |  .8256453          | x(1|8)    | 1.4236003 |  .6106530
x(1|3)    |  .8462844 |  .7479754          | x(2|8)    |  .8522249 |  .4892862
x(2|3)    | 0         |  .6698292          | x(3|8)    |  .4728225 |  .4480723
x(1|4)    | 1.0293754 |  .7012241          | x(4|8)    |  .1525144 |  .4326503
x(2|4)    |  .2970114 |  .6003793          | x(1|9)    | 1.4850132 |  .5977903
x(1|5)    | 1.1629645 |  .6689799          | x(2|9)    |  .9322975 |  .4750755
x(2|5)    |  .4950190 |  .5581388          | x(3|9)    |  .5719708 |  .4317205
x(3|5)    | 0         |  .5355685          | x(4|9)    |  .2745259 |  .4129877
x(1|6)    | 1.2672064 |  .6449241          | x(5|9)    | 0         |  .4075553
x(2|6)    |  .6417550 |  .5287511          | x(1|10)   | 1.5387527 |  .5868083
x(3|6)    |  .2015468 |  .4961981          | x(2|10)   | 1.0013571 |  .4631674
x(1|7)    | 1.3521784 |  .6260334          | x(3|10)   |  .6560591 |  .4183339
x(2|7)    |  .7573743 |  .5066882          | x(4|10)   |  .3757647 |  .3974153
x(3|7)    |  .3527070 |  .4687447          | x(5|10)   |  .1226678 |  .3886565
x(4|7)    | 0         |  .4587449          |           |           |

We integrate by parts with respect to $x_i$ and then with respect to $x_j$: the integral
(3) is then seen to be $\psi(i + r, n - j + s + 1)$, and
\[
E(x_i x_j) = \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!} \sum_{r=0}^{j-i-1}\ \sum_{s=0}^{j-i-1-r} \frac{(-1)^{r+s}(j-i-1)!}{r!\,s!\,(j-i-1-r-s)!}\,\psi(i + r, n - j + s + 1). \tag{4}
\]


Using (1), (2) and (4), the values in Tables 1, 2, and 3 are obtained. The
values are estimated to be correct, except for sample sizes 9 and 10, for which
there may be errors of one or two units in the last place given. Missing values
are filled in by considerations of symmetry.


3. Exact values. All the integrals occurring for $\psi(i)$ or $\psi(i, j)$ can, by suitable
transformations, and the integration of one variable over the range $-\infty$ to $\infty$,





TABLE 2

Variances and covariances

                                          j
 n  i      1      2      3      4      5      6      7      8      9     10
 2  1 .68169 .31831
 3  1 .55947 .27566 .16487
    2        .44867
 4  1 .49172 .24559 .15801 .10468
    2        .36046 .23594
 5  1 .44753 .22433 .14815 .10577 .07422
    2        .31152 .20844 .14994
    3               .28683
 6  1 .41593 .20850 .13944 .10243 .07736 .05634
    2        .27958 .18899 .13966 .10591
    3               .24621 .18327
 7  1 .39192 .19620 .13212 .09849 .07656 .05992 .04480
    2        .25673 .17448 .13073 .10196 .07998
    3               .21972 .16556 .12960
    4                      .21045
 8  1 .37290 .18631 .12597 .09472 .07477 .06021 .04830 .03684
    2        .23940 .16320 .12326 .09757 .07872 .06325
    3               .20077 .15236 .12096 .09782
    4                      .18719 .14918
 9  1 .35735 .17814 .12075 .09131 .07274 .05948 .04908 .04009 .03106
    2        .22570 .15412 .11701 .09345 .07655 .06324 .05171
    3               .18638 .14208 .11377 .09336 .07723
    4                      .17056 .13699 .11267
    5                             .16610
10  1 .34434 .17126 .11626 .08825 .07074 .05840 .04892 .04108 .03404 .02675
    2        .21452 .14662 .11170 .08974 .07420 .06222 .05232 .04336
    3               .17500 .13380 .10774 .08923 .07492 .06302
    4                      .15794 .12751 .10579 .08895
    5                             .15105 .12560




TABLE 3

Correlations between order statistics

                                j
 n  i     2     3     4     5     6     7     8     9    10
 2  1 .4669
 3  1 .5502 .2947
 4  1 .5834 .3753 .2129
    2       .6546
 5  1 .6008 .4135 .2833 .1658
    2       .6973 .4813
 6  1 .6114 .4357 .3201 .2269 .1355
    2        [?]  .5323 .3788
    3             .7444
 7  1 .6185 .4502 .3429 .2609 .1889 .1143
    2       .7346 .5624 .4293 .3115
    3             .7699 .5899
 8  1 .6236 .4604 .3585 .2830 .2200 .1617 .0988
    2       .7444 .5823 .4609 .3591 .2642
    3             .7859 .6240 .4872
    4                   .7969
 9  1  [?]   [?]   [?]  .2986 .2409 .1902 .1412 .0869
    2       .7514 .5964 .4827 .3902 .3083 .2291
    3             .7969 .6466 .5236 .4144
    4                   .8139 .6606
10  1 .6301 .4736 .3784 .3102 .2561 .2098 .1674 .1252 .0777
    2       .7567 .6068 .4985 .4122 .3380 .2700 .2021
    3             .8048 .6627 .5488 .4507 .3601
    4                   .8255 .6849 .5632
    5                         .8315
be represented as multiples of $\int_{0}^{\infty}\!\int_{0}^{\infty} \cdots\, e^{-Q}\,dx\,dy \cdots$, where $Q$ is a positive-definite
quadratic form in the variables of integration.










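Now if $Q$ is $ax^{2}$, the integral is $\tfrac{1}{2}\sqrt{\pi/a}$ (this is, in effect, stated by Jones).
By elementary integration we have also that if $Q = ax^{2} + 2hxy + by^{2}$, the
integral is

\[
\frac{1}{2\sqrt{ab - h^{2}}}\left( \frac{\pi}{2} - \arctan \frac{h}{\sqrt{ab - h^{2}}} \right),
\]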
TABLE 4

Exact expected values

x(1|4):          $\sqrt{\pi}\,[(2/5)a + (2/5)c]$
x(2|4):          $\sqrt{\pi}\,[(2/5)a - (6/5)c]$
x(1|5):          $\sqrt{\pi}\,[(1/3)a + c]$
x(2|5):          $\sqrt{\pi}\,[(2/3)a - 2c]$
x(3|5):          0
x(1|5)^2:        1 + b + d
x(2|5)^2:        1 - 4d
x(3|5)^2:        1 - 2b + 6d
x(1|5)x(2|5):    b + d
x(1|5)x(3|5):    2a - 2b - 2d - f
x(1|5)x(4|5):    -2a + 3f
x(1|5)x(5|5):    -2f
x(2|5)x(3|5):    -2a + 3b - d + f
x(2|5)x(4|5):    4a - 4b + 4d - 4f
x(1|6)^2:        1 + b + 3d
x(2|6)^2:        1 + b - 9d
x(3|6)^2:        1 - 2b + 6d
x(1|6)x(2|6):    b + 3d
x(1|6)x(3|6):    3a - 2b + 3c - 6d - 3f
x(1|6)x(4|6):    -3a - 9c + 9f
x(1|6)x(5|6):    12c - 6f
x(1|6)x(6|6):    -6c
x(2|6)x(3|6):    -3a + 4b - 3c + 3f
x(2|6)x(4|6):    9a - 6b + 9c + 6d - 15f
x(2|6)x(5|6):    -6a - 18c + 18f
x(3|6)x(4|6):    -6a + 6b - 6d + 6f






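and if $Q$ is $ax^{2} + by^{2} + cz^{2} + 2fyz + 2gzx + 2hxy$, the integral is

\[
\frac{\sqrt{\pi}}{4\sqrt{\Delta}} \left\{ \frac{\pi}{2} + \arctan \frac{gh - af}{\sqrt{a\Delta}} + \arctan \frac{hf - bg}{\sqrt{b\Delta}} + \arctan \frac{fg - ch}{\sqrt{c\Delta}} \right\},
\]

where $\Delta = abc + 2fgh - af^{2} - bg^{2} - ch^{2}$.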

The author has not succeeded in obtaining similar results with a higher 
number of variables—it is possible that elementary functions no longer suffice 
then. 
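
The two-variable closed form can be compared with a direct numerical integration. This is only a sketch under stated assumptions: the integral is taken over the positive quadrant, the closed form is read as $\frac{1}{2\sqrt{ab-h^2}}\big(\frac{\pi}{2} - \arctan\frac{h}{\sqrt{ab-h^2}}\big)$, the midpoint rule and the truncation point `upper` are arbitrary choices, and the function names are hypothetical.

```python
import math


def quadrant_integral_closed(a, b, h):
    """Closed form for the integral of exp(-(a x^2 + 2 h x y + b y^2)) over x, y > 0."""
    root = math.sqrt(a * b - h * h)  # requires a positive-definite form
    return (math.pi / 2 - math.atan(h / root)) / (2 * root)


def quadrant_integral_numeric(a, b, h, upper=8.0, steps=400):
    """Midpoint-rule estimate of the same integral on [0, upper]^2."""
    step = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * step
        for j in range(steps):
            y = (j + 0.5) * step
            total += math.exp(-(a * x * x + 2 * h * x * y + b * y * y))
    return total * step * step


closed_value = quadrant_integral_closed(1.0, 2.0, 0.5)
numeric_value = quadrant_integral_numeric(1.0, 2.0, 0.5)
```

For $h = 0$ the closed form reduces to the product of two one-variable integrals $\tfrac12\sqrt{\pi/a}\cdot\tfrac12\sqrt{\pi/b}$, which gives a quick sanity check.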


Using these results we can obtain exact expressions for $\psi(1)$, $\psi(2)$ and $\psi(i, j)$
for $1 \le i, j$; $i + j \le 6$, which give, in addition to Jones' results, the exact expected
values in Table 4, wherein

a = $15/4\pi$ = 1.19366 20732,
b = $5\sqrt{3}/4\pi$ = .68916 11193,
c = $(15/2\pi^{2}) \arcsin (1/3)$ = .25824 50843,
d = $(5\sqrt{3}/2\pi^{2}) \arcsin \tfrac{1}{4}$ = .11085 93167,
f = $(15/\pi^{2}) \arcsin (1/\sqrt{6})$ = .63913 55493.
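
The constants and two of the Table 4 entries can be evaluated directly and compared with Table 1. The snippet is a minimal check, not part of the original computation; the variable names are chosen here for illustration.

```python
import math

# Constants of Table 4 as defined in the text.
a = 15 / (4 * math.pi)
b = 5 * math.sqrt(3) / (4 * math.pi)
c = 15 / (2 * math.pi ** 2) * math.asin(1 / 3)
d = 5 * math.sqrt(3) / (2 * math.pi ** 2) * math.asin(1 / 4)
f = 15 / math.pi ** 2 * math.asin(1 / math.sqrt(6))

# Exact mean and second moment of the largest of five, from Table 4.
mean_1_5 = math.sqrt(math.pi) * (a / 3 + c)
second_moment_1_5 = 1 + b + d

# Standard deviation implied by those two exact values.
sd_1_5 = math.sqrt(second_moment_1_5 - mean_1_5 ** 2)
```

The results should agree with the Table 1 entries for x(1|5), namely mean 1.1629645 and standard deviation .6689799.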
REFERENCES 


[1] C. Hastings, Jr., F. Mosteller, J. W. Tukey and C. P. Winsor, "Low moments for
small samples: a comparative study of order statistics," Annals of Math. Stat.,
Vol. 18 (1947).

[2] J. O. Irwin, "The further theory of Francis Galton's individual-difference problem,"
Biometrika, Vol. 17 (1925).

[3] H. L. Jones, "Exact lower moments of order statistics in small samples from a normal
distribution," Annals of Math. Stat., Vol. 19 (1948).








ON A THEOREM OF HSU AND ROBBINS 
By P. Erdős


Syracuse University 


Let $f_1(x), f_2(x), \cdots$ be an infinite sequence of measurable functions defined
on a measure space $X$ with measure $m$, $m(X) = 1$, all having the same distribution
function $G(t) = m(x; f_k(x) < t)$. In a recent paper Hsu and Robbins¹
prove the following theorem: Assume that

\[
\int_{-\infty}^{\infty} t\,dG(t) = 0, \tag{1}
\]
\[
\int_{-\infty}^{\infty} t^{2}\,dG(t) < \infty. \tag{2}
\]

Denote by $S_n$ the set $\big(x; \big|\sum_{k=1}^{n} f_k(x)\big| > n\big)$, and put $M_n = m(S_n)$. Then $\sum_{n=1}^{\infty} M_n$
converges.


It is clear that the same holds if $\big|\sum_{k=1}^{n} f_k(x)\big| > n$ is replaced by $\big|\sum_{k=1}^{n} f_k(x)\big| > \epsilon \cdot n$
(replace $f_k(x)$ by $\epsilon^{-1} f_k(x)$).
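
As an illustration of the quantity being summed, the tail probability $m\big(x; |\sum_{k \le n} f_k(x)| > \epsilon n\big)$ can be computed exactly when the $f_k$ are independent signs $\pm 1$ with equal probability (the Rademacher functions, which have mean 0 and variance 1). This is only a sketch: the choice of example, the value of $\epsilon$, and the function name are assumptions made here.

```python
from math import comb


def tail_prob(n, eps):
    """Exact probability that |r_1 + ... + r_n| > eps * n for independent +-1 signs."""
    favourable = 0
    for plus_count in range(n + 1):
        s = 2 * plus_count - n          # value of the sum with this many +1 terms
        if abs(s) > eps * n:
            favourable += comb(n, plus_count)
    return favourable / 2 ** n


# Partial sums of the series whose convergence the theorem asserts.
partial_100 = sum(tail_prob(n, 0.5) for n in range(1, 101))
partial_200 = sum(tail_prob(n, 0.5) for n in range(1, 201))
```

The two partial sums differ only slightly, in line with the convergence of the series.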
It was conjectured that the conditions (1) and (2) are necessary for the
convergence of $\sum_{n=1}^{\infty} M_n$. Dr. Chung pointed out to me that in this form the
conjecture is inaccurate; to see this it suffices to put $f_k(x) = \tfrac{1}{2}(1 + r_k(x))$ where
$r_k(x)$ is the $k$th Rademacher function. Clearly $|f_k(x)| \le 1$; thus $M_n = 0$,
thus $\sum_{n=1}^{\infty} M_n$ converges, but $\int t\,dG(t) \ne 0$. On the other hand we shall show
in the present note that the conjecture of Hsu and Robbins is essentially correct.
In fact we prove
In fact we prove 
THEOREM I. The necessary and sufficient condition for the convergence of
$\sum_{n=1}^{\infty} M_n$ is that

\[
\left| \int_{-\infty}^{\infty} t\,dG(t) \right| < 1, \tag{1'}
\]

and (2) should hold.
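
Chung's example can be checked by brute force for small $n$. The sketch below enumerates every sign pattern of $r_1, \cdots, r_n$ with equal weight (an assumption that mirrors the independence of the Rademacher functions); the function name is hypothetical.

```python
from itertools import product


def counterexample_summary(n):
    """Return (M_n, mean of f_k) for f_k = (1 + r_k) / 2 over all 2^n sign patterns."""
    sums = [sum((1 + r) / 2 for r in signs) for signs in product((-1, 1), repeat=n)]
    m_n = sum(1 for s in sums if abs(s) > n) / len(sums)   # measure of S_n
    mean_of_f = sum(sums) / (n * len(sums))                # integral of t dG(t)
    return m_n, mean_of_f


summaries = [counterexample_summary(n) for n in range(1, 11)]
```

Every entry is `(0.0, 0.5)`: the sets $S_n$ are empty although the mean is $\tfrac12$, so (1) fails while (1') holds.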


In proving the sufficiency of Theorem I, we can assume without loss of generality
that (1) holds. It suffices to replace $f_k(x)$ by $(f_k(x) - C)$ where $C = \int_{-\infty}^{\infty} t\,dG(t)$.
The following proof of the sufficiency of Theorem I (in other words essentially for
the theorem of Hsu and Robbins) is simpler and quite different from theirs.
Put

\[
a_i = m(x; |f_k(x)| > 2^{i}); \tag{3}
\]


1 Proc. Nat. Acad. Sciences, 1947, pp. 25-31. 




since the $f_k$'s all have the same distribution, $a_i$ clearly does not depend on $k$.
We evidently have 


eas < Ma: — ain) < i] Pda) < 2" (a; — ain) < DP ay. 


i=0 i=0 i=0 i=0 
Thus (2) is equivalent to
\[
\sum_{i=0}^{\infty} 2^{2i} a_i < \infty. \tag{4}
\]
Let $2^{i} \le n < 2^{i+1}$. Put
\[
S_n^{(1)} = \big(x; |f_k(x)| > 2^{i-2}, \text{ for at least one } k \le n\big),
\]
\[
S_n^{(2)} = \big(x; |f_{k_1}(x)| > n^{4/5}, |f_{k_2}(x)| > n^{4/5}, \text{ for at least two } k_1 \le n, k_2 \le n\big),
\]
\[
S_n^{(3)} = \Big(x; \Big|\sum_{k=1}^{n}{}' f_k(x)\Big| > 2^{i-2}\Big),
\]
where the dash indicates that the $k$ with $|f_k(x)| > n^{4/5}$ are omitted. We
evidently have

\[
S_n \subset S_n^{(1)} \cup S_n^{(2)} \cup S_n^{(3)}.
\]

For if $x$ is not in $S_n^{(1)} \cup S_n^{(2)} \cup S_n^{(3)}$, then clearly

\[
\Big|\sum_{k=1}^{n} f_k(x)\Big| \le 2^{i-2} + 2^{i-2} < n.
\]


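Thus to prove the convergence of $\sum_{n=1}^{\infty} M_n$ it will suffice to show that

\[
\sum_{n=1}^{\infty} \big(m(S_n^{(1)}) + m(S_n^{(2)}) + m(S_n^{(3)})\big) < \infty. \tag{5}
\]

From (3) we obtain that $m(S_n^{(1)}) \le n \cdot a_{i-2} < 2^{i+1} \cdot a_{i-2}$. Thus from (4)

\[
\sum_{n=1}^{\infty} m(S_n^{(1)}) = \sum_{i=0}^{\infty}\ \sum_{2^{i} \le n < 2^{i+1}} m(S_n^{(1)}) < \sum_{i=0}^{\infty} 2^{2i+1} a_{i-2} < \infty. \tag{6}
\]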


From (4) we evidently have that for large $u$
\[
m(x; |f_k(x)| > u) < 1/u^{2}.
\]

Thus since the $f$'s are independent and have the same distribution function it
follows that for sufficiently large $n$,

\[
m(S_n^{(2)}) \le \sum_{k_1 < k_2 \le n} m\big(x; |f_{k_1}(x)| > n^{4/5}, |f_{k_2}(x)| > n^{4/5}\big)
\]
\[
\le \binom{n}{2}\, m\big(x; |f_1(x)| > n^{4/5}\big) \cdot m\big(x; |f_2(x)| > n^{4/5}\big) < n^{2} \cdot n^{-16/5} = n^{-6/5}.
\]
Hence
\[
\sum_{n=1}^{\infty} m(S_n^{(2)}) < \infty. \tag{7}
\]










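Put

\[
f_k^{+}(x) = \begin{cases} f_k(x) & \text{for } |f_k(x)| \le n^{4/5}; \\ 0 & \text{otherwise.} \end{cases}
\]

Clearly the $f_k^{+}(x)$ are independent and have the same distribution function
$G^{+}(t)$. Put

\[
\int_{-\infty}^{\infty} t\,dG^{+}(t) = \epsilon, \qquad g_k(x) = f_k^{+}(x) - \epsilon. \tag{8}
\]

We have from (8) that $\int_X g_k(x)\,dm = 0$, and by (1) that $\epsilon \to 0$ as $n \to \infty$. We
evidently have

\[
\int_X \Big(\sum_{k=1}^{n} g_k(x)\Big)^{4} dm = \sum_{k=1}^{n} \int_X g_k^{4}(x)\,dm + 6 \sum_{1 \le k < l \le n} \int_X g_k^{2}(x) \cdot g_l^{2}(x)\,dm.
\]

Now since $\max |g_k(x)| \le n^{4/5} + \epsilon$,

\[
\int_X g_k^{4}(x)\,dm \le (n^{4/5} + \epsilon)^{2} \cdot \int_X g_k^{2}(x)\,dm < c_1 \cdot n^{8/5},
\]

and
\[
\int_X g_k^{2}(x) \cdot g_l^{2}(x)\,dm = \int_X g_k^{2}(x)\,dm \cdot \int_X g_l^{2}(x)\,dm < c_2.
\]
Thus
\[
\int_X \Big(\sum_{k=1}^{n} g_k(x)\Big)^{4} dm < c_3\, n^{13/5}.
\]
Hence
\[
m\Big(x; \Big|\sum_{k=1}^{n} g_k(x)\Big| > n/16\Big) < c_4\, n^{-7/5}. \tag{9}
\]

Thus from (8), (9), $|f_k^{+}(x)| \le |g_k(x)| + 1/16$ (for $\epsilon < 1/16$) and $n/8 < 2^{i-2}$
we have

\[
m\Big(x; \Big|\sum_{k=1}^{n}{}' f_k(x)\Big| > 2^{i-2}\Big) = m\Big(x; \Big|\sum_{k=1}^{n} f_k^{+}(x)\Big| > 2^{i-2}\Big) \le m\Big(x; \Big|\sum_{k=1}^{n} g_k(x)\Big| > n/16\Big) < c_4\, n^{-7/5},
\]

or
\[
m(S_n^{(3)}) < c_4\, n^{-7/5}. \tag{10}
\]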


Thus finally from (6), (7) and (10) we obtain (5) and this completes the proof 
of the sufficiency of Theorem I. 




Next we prove the necessity of Theorem I, in other words we shall show that if
$\sum_{n=1}^{\infty} M_n$ converges then (1') and (2) hold.

First we prove (2). The following proof was suggested by Dr. Chung, who
simplified my original proof. By a simple rearrangement we see that (2) is
equivalent to

\[
\sum_{n=1}^{\infty} n \int_{|t| > cn} dG(t) < \infty \tag{11}
\]

for any $c > 0$; while

\[
\int_{-\infty}^{\infty} |t|\,dG(t) < \infty \tag{12}
\]

is equivalent to

\[
\sum_{n=1}^{\infty} \int_{|t| > cn} dG(t) < \infty \tag{13}
\]

for any $c > 0$. Now we have clearly,
\[
(x; |f_n(x)| > 2n) \subset S_{n-1} \cup S_n.
\]
Hence

\[
\sum_{n} \int_{|t| > 2n} dG(t) \le \sum_{n} \big(m(S_{n-1}) + m(S_n)\big) < \infty.
\]

Thus we obtain (12). Since the terms of this series are non-increasing it follows
that

\[
n \int_{|t| > 2n} dG(t) \to 0. \tag{14}
\]

Our assumption being that $\sum M_n < \infty$ we have $M_n \to 0$ as $n \to \infty$. It follows
that there is a constant $\rho > 0$ independent of $k$ and $n$ such that

\[
m\Big(x; \Big|\sum_{l \le n,\ l \ne k} f_l(x)\Big| < n\Big) \ge \rho.
\]
Ik 





Now, writing set intersections as products, we have
\[
\bigcup_{k=1}^{n} (x; |f_k(x)| > 2n) \cdot \Big(x; \Big|\sum_{l=1,\ l \ne k}^{n} f_l(x)\Big| < n\Big) \subset S_n.
\]
Writing this for a moment as

\[
\bigcup_{k=1}^{n} (R_k \cdot T_k) \subset S_n,
\]










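where $R_k = (x; |f_k(x)| > 2n)$ etc. and denoting by $R'$ the complement of $R$
we have
\begin{align*}
M_n = m(S_n) &\ge m\Big(\bigcup_{k=1}^{n} (R_k \cdot T_k)\Big) \\
&= m\Big(\bigcup_{k=1}^{n} (R_1 T_1)' \cdots (R_{k-1} T_{k-1})' R_k T_k\Big) \\
&= \sum_{k=1}^{n} m\big((R_1 T_1)' \cdots (R_{k-1} T_{k-1})' R_k T_k\big) \\
&\ge \sum_{k=1}^{n} m\big(R_1' \cdots R_{k-1}' R_k T_k\big) \\
&= \sum_{k=1}^{n} \big\{m(R_k \cdot T_k) - m((R_1 \cup \cdots \cup R_{k-1}) R_k T_k)\big\} \\
&\ge \sum_{k=1}^{n} \big\{m(T_k) - (k - 1)m(R_1)\big\} m(R_k) \\
&\ge \sum_{k=1}^{n} \big\{\rho - n\,m(R_1)\big\} m(R_k) = \sum_{k=1}^{n} (\rho - o(1))\, m(R_k) \\
&\ge \rho' \sum_{k=1}^{n} m(R_k) = n\rho' \int_{|t| > 2n} dG(t)
\end{align*}
by (14) since $m(R_k) = \int_{|t| > 2n} dG(t)$, $n\,m(R_1) \to 0$ as $n \to \infty$.
Thus

\[
\sum_{n} n \int_{|t| > 2n} dG(t) \le \frac{1}{\rho'} \sum_{n} M_n < \infty.
\]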


Hence we have (11), which is equivalent to (2). The proof of (1') is quite easy.
By virtue of (2) we can put

\[
\int_{-\infty}^{\infty} t\,dG(t) = C.
\]

If $|C| > 1$, then it follows from (2) and Tchebycheff's inequality that $M_n \to 1$ as
$n \to \infty$, thus $|C| \le 1$. But if $|C| = 1$, we conclude from (2) and the central limit
theorem that $M_n$ does not tend to 0. Hence $|C| < 1$, and (1') is proved.


s 


By similar methods we can prove the following results: Let $2 < c < 4$. Put

\[
M_n^{(c)} = m\Big(x; \Big|\sum_{k=1}^{n} f_k(x)\Big| > n^{2/c}\Big).
\]

Then the necessary and sufficient condition for the convergence of $\sum_{n=1}^{\infty} M_n^{(c)}$
is that

\[
\int_{-\infty}^{\infty} t\,dG(t) = 0, \qquad \int_{-\infty}^{\infty} |t|^{c}\,dG(t) < \infty.
\]

If $c < 2$ then the necessary and sufficient condition for the convergence of
$\sum_{n=1}^{\infty} M_n^{(c)}$ is that $\int_{-\infty}^{\infty} |t|^{c}\,dG(t) < \infty$.

Finally we can prove the following result: Assume that $\int t\,dG(t) = 0$ and
$\int t^{4}\,dG(t) < \infty$. Then there exists a constant $r$ so that

\[
\sum_{n=1}^{\infty} m\Big(x; \Big|\sum_{k=1}^{n} f_k(x)\Big| > n^{1/2} \cdot (\log n)^{r}\Big) < \infty. \tag{17}
\]

The case of the Rademacher functions shows that (17) can not be improved
very much, in fact only the value of $r$ could be improved.







NOTES 
This section is devoted to brief research and expository articles on methodology and
other short items.


BROWNIAN MOTION ON THE SURFACE OF THE 3-SPHERE 


By Késaku YOSIDA 


Mathematical Institute, Nagoya University 


1. Introduction. Let $S$ be an $n$-dimensional compact Riemann space with the
metric $ds^{2} = g_{ij}(x)\,dx^{i} dx^{j}$ such that the totality $G$ of the isometric transformations
of $S$ onto $S$ constitutes a Lie group transitive on $S$. Consider a temporally
homogeneous Markoff process by which $P(t, x, y)$, $t > 0$, is the transition probability
that a point $x$ is transferred to $y$ after the elapse of $t$-unit time. We
assume that $P(t, x, y)$ is a Baire function in $(t, x, y)$ and continuous in $t$, then $P$
satisfies Smoluchowski's equation

\[
P(t + s, x, y) = \int_S P(t, x, z)P(s, z, y)\,dz \qquad (t, s > 0), \tag{1.1}
\]
$dz$ being the $G$-invariant measure $\sqrt{g(x)}\,dx^{1} dx^{2} \cdots dx^{n}$, $g(x) = \det(g_{ij}(x))$, and
\[
P(t, x, y) \ge 0, \tag{1.2}
\]
\[
\int_S P(t, x, y)\,dy = 1. \tag{1.3}
\]

The spatial homogeneity of the transition process may be defined by

\[
P(t, Tx, Ty) = P(t, x, y) \quad \text{for } T \in G. \tag{1.4}
\]

The "continuity" of the transition process may be defined, following A.
Kolmogoroff and W. Feller,¹ as follows. Let $L_1(S)$ be the function space of
integrable (with respect to $dx$) functions $f(x)$ on $S$, then, for those $f(x)$ which are
dense in $L_1(S)$,

\[
\frac{\partial f(t, x)}{\partial t} = A \cdot f(t, x), \quad (t \ge 0); \qquad
f(t, x) = \int_S f(y)P(t, y, x)\,dy, \quad (t > 0), \qquad f(0, x) = f(x), \tag{1.5}
\]

where, with non-negative $b^{ij}(x)$,
\[
(Af)(x) = -\frac{1}{\sqrt{g(x)}} \frac{\partial}{\partial x^{i}} \big(\sqrt{g(x)}\,a^{i}(x) f(x)\big)
+ \frac{1}{\sqrt{g(x)}} \frac{\partial^{2}}{\partial x^{i} \partial x^{j}} \big(\sqrt{g(x)}\,b^{ij}(x) f(x)\big). \tag{1.6}
\]


¹ A. Kolmogoroff, "Zur Theorie der stetigen zufälligen Prozesse," Math. Annalen, Vol.
108 (1933); W. Feller, "Zur Theorie der stochastischen Prozesse," Math. Annalen, Vol. 113
(1937).







The temporally and spatially homogeneous "continuous" Markoff process
may, if it exists, be called a Brownian motion on the homogeneous space $S$.
The purpose of the present note is to show that, under some derivability hypothesis
concerning $a^{i}(x)$ and $b^{ij}(x)$, there exists one and (essentially) only one Brownian
motion on the surface of the 3-sphere $S^{3}$.


I here express my hearty thanks to Dr. Kiyosi Itô who proposed to me the
problem and discussed and much improved the manuscript.


2. The defining equation for the Brownian motion. The spatial homogeneity
(1.4) is equivalent to the fact that $A$ is commutative with every operator $T$ defined
by

\[
(Tf)(x) = f(Tx), \quad T \in G, \tag{2.1}
\]
because we have
\[
\int f(y)P(t, y, Tx)\,dy = \int f(Ty)P(t, Ty, Tx)\,d(Ty) = \int f(Ty)P(t, y, x)\,dy.
\]
The condition (2.1) is equivalent to
\[
XA = AX \quad \text{for any infinitesimal operator } X = \xi^{i}(x) \frac{\partial}{\partial x^{i}} \tag{2.2}
\]


induced on S by the infinitesimal operator of the Lie group G. Thus, assuming 
the derivability of a*(x) and b’(x) of necessary orders, we obtain from (2.2) the 
conditions: 


in al 1 aG*(zx) 
(2.3) E(x) a (se ag = % 


(c'@) = — vat ate) + 2¥a@H"O), 








‘ dé (x) isp. O€ (x) 1 : 
(2.4) V/ 91 AMO + WO Bape FO zee) 


(H(z) = G'(z) ae < (Wate) b” (x)), 
(2.5) oa) ES) 4 yg) HO) @) = (a) SS. 


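Now for the surface of the 3-sphere $S^{3}$,
\[
ds^{2} = d\theta^{2} + \sin^{2}\theta \cdot d\varphi^{2}, \qquad g(\theta, \varphi) = \sin^{2}\theta,
\]
and the infinitesimal operators

\[
X_x = \sin\varphi \frac{\partial}{\partial\theta} + \frac{\cos\theta \cos\varphi}{\sin\theta} \frac{\partial}{\partial\varphi}, \qquad
X_y = \cos\varphi \frac{\partial}{\partial\theta} - \frac{\cos\theta \sin\varphi}{\sin\theta} \frac{\partial}{\partial\varphi}, \qquad
X_z = \frac{\partial}{\partial\varphi}
\]

respectively correspond to the rotations about the x-, y- and z-axis.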
From (2.5) we see that, by taking $X = X_z$,

(2.6) $b^{ij}(\theta, \varphi)$ is independent of $\varphi$.

By taking $X = X_z$ in (2.4) we see that $H^{k}$ is independent of $\varphi$. Hence, by (2.6),

(2.7) $a^{i}(\theta, \varphi)$ is independent of $\varphi$.

Thus, by taking k = 1, X = X, we obtain from (2.4), 


sin 


1 sii - : . d 
Sithinwcan 7 0s oO bb s y 2 2 — 
; H’(@) cos « (6) sin ¢ ing $ (x ag 1") 


and thus 
(2.8) H*@) =0, w'@)+- S (a5 H 0) = 0, 


Hence, by taking k = 2, YX = X,or X = Xy, we obtain from (2.4) 








— H'(@) cos @ 4 apt @) 2 es Acoso 1 oy aes sin : _ x@ © 8 cos ¢ 





: : - =Q, 
sin? @ sin @ 
1 2 
H (6) sin ¢ — 20") ances 2p?(6) ws + 626) cos 6 sin ¢ _ 
sin? 6 sin? 6 n? 6 sin 6 


From these two equations we obtain 


H’ a cos 2 cos : 
in? 


(2.9) b”°(6) = 0 — 2b "O —5, + b° “@) = =n, ae 
By taking 7 = 2,/: = 1, X = X,, we obtain from (2.5), (2.9) 


b” (8) cos g + b"(6) (sasom ‘) = 








sin 6 
and hence
\[
b^{22}(\theta) = \frac{b^{11}(\theta)}{\sin^{2}\theta}. \tag{2.10}
\]
Similarly by taking 7 = 1, = 1, X = X, we obtain from (2.5) 
ll; . 
. . ; ) 
b”(@) cos ¢ + b"(8) cos ¢ = sine db 9, 
dé 
and hence by (2.9), (2.10)
\[
b^{11}(\theta) = \text{constant } C, \qquad b^{22}(\theta) = \frac{C}{\sin^{2}\theta}. \tag{2.11}
\]


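Thus we obtain from (2.4)

\[
H^{1}(\theta) = -a^{1}(\theta) \sin\theta + 2C \cos\theta, \qquad H^{2}(\theta) = -\sin\theta \cdot a^{2}(\theta)
\]
and thus, by (2.8),
\[
a^{2}(\theta) = 0. \tag{2.12}
\]
Substituting (2.11) in (2.9) we obtain
\[
a^{1}(\theta) = C\,\frac{\cos\theta}{\sin\theta}. \tag{2.13}
\]

Therefore since $b^{11}(\theta)$ and $b^{22}(\theta)$ are non-negative, $A$ is (essentially) equal to
the Laplace operator
\[
\frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \Big(\sin\theta \frac{\partial}{\partial\theta}\Big) + \frac{1}{\sin^{2}\theta} \frac{\partial^{2}}{\partial\varphi^{2}}. \tag{2.14}
\]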





Thus we may obtain $P(t, x, y)$ by integrating the equation

\[
\frac{\partial f(t; \theta, \varphi)}{\partial t} = A \cdot f(t; \theta, \varphi), \quad (t \ge 0), \tag{2.15}
\]
and by putting

\[
f(t; \theta, \varphi) = f(t, x) = \int_S f(y)P(t, y, x)\,dy. \tag{2.16}
\]

3. Integration of the equation (2.15)-(2.16). Consider the Laplacian (real)
spherical harmonics
\[
Y_k^{(m)}(\theta, \varphi) = Y_k^{(m)}(x), \quad (-k \le m \le k;\ k = 0, 1, \cdots). \tag{3.1}
\]

They constitute an orthonormal function system complete for continuous
functions on $S^{3}$, and we have

\[
A \cdot Y_k^{(m)}(\theta, \varphi) = -k(k + 1)\,Y_k^{(m)}(\theta, \varphi). \tag{3.2}
\]

Since, as is well-known,

\[
Y_k^{(m)}(T^{-1}x) = \sum_{n=-k}^{k} u_{mn}^{k}(T)\,Y_k^{(n)}(x) \tag{3.3}
\]
by an irreducible orthogonal representation $(u_{mn}^{k}(T))$ of the rotation group $G$,
we have

\[
\max_x |Y_k^{(m)}(x)|^{2} \le (2k + 1) \min_x \sum_{n=-k}^{k} |Y_k^{(n)}(x)|^{2}, \tag{3.4}
\]

by applying the Schwarz inequality and the transitivity of the group $G$ on
$S^{3}$. The right hand member satisfies, by the orthonormality

\[
\le (2k + 1)^{2}/(\text{area of } S^{3}). \tag{3.5}
\]


Therefore the double series (for $t > 0$)

\[
P(t; \theta, \varphi; \theta', \varphi') = \sum_{k=0}^{\infty} \sum_{m=-k}^{k} \exp(-k(k + 1)t)\,Y_k^{(m)}(\theta, \varphi)\,Y_k^{(m)}(\theta', \varphi') \tag{3.6}
\]
is absolutely and uniformly convergent on $S^{3}$. We will show that this $P$ is
the required (unique) Brownian motion on $S^{3}$.
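
The series (3.6) can be evaluated numerically. The sketch below is illustrative and rests on assumptions not stated in the note: it uses the classical addition theorem $\sum_{m} Y_k^{(m)}(x)Y_k^{(m)}(y) = \frac{2k+1}{4\pi}P_k(\cos\gamma)$ for orthonormal harmonics on the unit sphere (area $4\pi$, $P_k$ the Legendre polynomial, $\gamma$ the angle between $x$ and $y$), takes the constant $C = 1$, truncates the series at a hypothetical `kmax`, and uses a midpoint rule for the mass check.

```python
import math


def legendre_values(kmax, u):
    """P_0(u), ..., P_kmax(u) from the three-term recurrence."""
    vals = [1.0, u]
    for k in range(1, kmax):
        vals.append(((2 * k + 1) * u * vals[k] - k * vals[k - 1]) / (k + 1))
    return vals[: kmax + 1]


def transition_density(t, u, kmax=60):
    """Truncated series (3.6) as a function of u = cos(angle between x and y)."""
    P = legendre_values(kmax, u)
    return sum(
        (2 * k + 1) / (4 * math.pi) * math.exp(-k * (k + 1) * t) * P[k]
        for k in range(kmax + 1)
    )


def total_mass(t, steps=2000):
    """Integral of the density over the sphere: 2*pi times the integral over u in [-1, 1]."""
    h = 2.0 / steps
    return 2 * math.pi * h * sum(
        transition_density(t, -1.0 + (j + 0.5) * h) for j in range(steps)
    )


mass = total_mass(0.5)
density_at_antipode = transition_density(0.5, -1.0)
density_at_start = transition_density(0.5, 1.0)
```

Under these assumptions the mass is 1, matching (1.3), and the density stays positive even at the antipodal point, matching (1.2).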
The proof may be given in three steps. i) We see by (3.2) and (3.6), that
$\int_S f(y)P(t, y, x)\,dy$ satisfies (2.15) if

\[
f(x) \sim \sum_{k=0}^{\infty} \sum_{m=-k}^{k} a_k^{(m)}\,Y_k^{(m)}(x), \qquad
\sum_{k=0}^{\infty} \sum_{m=-k}^{k} \exp(-k(k + 1)t)\,k(k + 1)\,a_k^{(m)}\,Y_k^{(m)}(x)
\]
are both absolutely and uniformly convergent. By the completeness of $\{Y_k^{(m)}(x)\}$,
such $f(x)$ are dense in $L_1(S)$.
ii) Because of (3.3) we see that (3.6) satisfies the spatial homogeneity (1.4).
iii) (1.3) is obvious by the orthonormality of $\{Y_k^{(m)}(x)\}$ and the constancy
on $S^{3}$ of $Y_0^{(0)}(x)$. Next, for the solution $f(t, x)$ of (2.15)-(2.16), let $f(x) =
f(0, x)$ be non-negative on $S^{3}$, then $g_\epsilon(t, x) = \exp(-\epsilon t)f(t, x)$, $(\epsilon > 0)$, satisfies

\[
\frac{\partial g_\epsilon(t, x)}{\partial t} = A \cdot g_\epsilon(t, x) - \epsilon\,g_\epsilon(t, x), \quad (t > 0),
\]
\[
g_\epsilon(0, x) = f(x) \ge 0 \quad (\text{on } S^{3}).
\]

Thus $g_\epsilon(t, x) \ge 0$ on $S^{3}$, since $g_\epsilon(t, x)$ cannot have a negative minimum on the
product space $[t_1, t_2] \times S^{3}$, for any $t_2 > t_1 > 0$. For at such minimizing point
we must have

\[
\frac{\partial g_\epsilon}{\partial t} \le 0, \quad \frac{\partial g_\epsilon}{\partial\theta} = 0, \quad \frac{\partial g_\epsilon}{\partial\varphi} = 0, \quad \frac{\partial^{2} g_\epsilon}{\partial\theta^{2}} \ge 0, \quad \frac{\partial^{2} g_\epsilon}{\partial\varphi^{2}} \ge 0.
\]

Therefore, since $\epsilon > 0$, $t_2 > t_1 > 0$ were arbitrary, we conclude that $f(t, x) \ge
0$ on $S^{3}$ for $t > 0$ if $f(x) \ge 0$ on $S^{3}$. This proves (1.2). The same argument
simultaneously shows us that the solution $P$ of (2.15)-(2.16) and (1.2)-(1.3) is
unique.




ON THE STRONG STABILITY OF A SEQUENCE OF EVENTS 


By ARYEH DVORETZKY 
Hebrew University, Jerusalem, and Institute for Advanced Study 
1. Summary. M. Loève [3] has found conditions under which a sequence of
events which may be interdependent in an arbitrary manner is strongly stable.
In this note it is established that considerably weaker conditions imply the
strong stability.

2. Introduction. Let
\[
A_1, A_2, \cdots, A_n, \cdots \tag{1}
\]


be a sequence of events, which may depend on each other in any way whatsoever,
defined on the same set of trials.

Let $R_n$ be the repetition function of (1), i.e. $R_n$ is the number of those among the
first $n$ events: $A_1, A_2, \cdots, A_n$ which were realized, and put $f_n = R_n/n$. The
random variable $f_n$ is called the frequency function of (1).

Denoting by $E\{x\} = \bar{x}$ the expected value of $x$ it is evident that

\[
\bar{R}_n = E\{R_n\} = \sum_{\nu=1}^{n} \Pr(A_\nu), \qquad \bar{f}_n = E\{f_n\} = \frac{1}{n} E\{R_n\}.
\]
Following Loève [3, p. 252] we say that (1) is strongly stable if the sequence
$\varphi_n = f_n - \bar{f}_n$ $(n = 1, 2, \cdots)$ is strongly stable in the usual Kolmogoroff sense
[1, p. 58], i.e. if

\[
\lim_{n \to \infty} \Pr\big(\sup_{\nu > n} |\varphi_\nu| > \epsilon\big) = 0 \tag{2}
\]
for every $\epsilon > 0$.
Putting¹
\[
\beta_n = \frac{1}{n} \sum_{\nu=1}^{n} \Pr(A_\nu), \qquad \gamma_n = \frac{2}{n(n - 1)} \sum_{1 \le \mu < \nu \le n} \Pr(A_\mu A_\nu)
\]
and introducing the abbreviation²
\[
\delta_n = \gamma_n - \beta_n^{2},
\]
Loève's result [3, pp. 257-9] is the following:

If $n\delta_n$ is bounded then (1) is strongly stable.

This, even when specialized to sequences of independent events, includes the
Bernoulli and Poisson cases.

Here the following stronger result will be established.

THEOREM. If $\sum \delta_n/n$ is convergent then (1) is strongly stable.

In particular, if for some $\epsilon > 0$ the sequence $n^{\epsilon}\delta_n$ is bounded then (1) is strongly
stable.


3. A lemma. The new tool here used is the following simple result on series of
positive terms.
LEMMA. Let $a_n \ge 0$ for $n = 1, 2, \cdots$ and
\[
\sum_{n=1}^{\infty} \frac{a_n}{n} \tag{3}
\]
be convergent. Then there exists a sequence $n_i$ of integers satisfying
\[
0 < n_{i+1} - n_i = o(n_i) \quad (i \to \infty), \tag{4}
\]
and such that the series $\sum_{i=1}^{\infty} a_{n_i}$ is convergent.

¹ $A_\mu A_\nu$ denotes the event: both $A_\mu$ and $A_\nu$.
² Our $\beta_n$, $\gamma_n$ and $\delta_n$ correspond to Loève's $p_1(n)$, $p_2(n)$ and $d_n^{2}$ respectively.










PROOF. Since (3) is convergent it is well known³ that there exists a sequence
of numbers $l_n$ $(n = 1, 2, \cdots)$ satisfying
\[
\lim_{n \to \infty} l_n = \infty \tag{5}
\]
having the property that
\[
\sum_{n=1}^{\infty} \frac{l_n a_n}{n} < \infty. \tag{6}
\]

We define inductively a sequence of integers $m(i)$ through
\[
m(1) = 1, \qquad m(i + 1) = m(i) + 1 + \left[\frac{m(i)}{l_{m(i)}}\right], \tag{7}
\]
the square brackets denoting the integral part. Clearly
\[
0 < m(i + 1) - m(i) = o(m(i)). \tag{8}
\]

Now for every $i$ we choose $n_i$ so that

\[
m(i) \le n_i < m(i + 1) \quad \text{and} \quad a_{n_i} = \min_{m(i) \le \nu < m(i+1)} a_\nu.
\]

These $n_i$ satisfy the requirements of the lemma.
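
The construction in the proof can be traced on a concrete example. The sketch below is an illustration under assumptions chosen here, not taken from the paper: $a_n = n^{-1/2}$ (so that $\sum a_n/n$ converges) and $l_n = n^{1/4}$ (which tends to infinity while $\sum l_n a_n/n$ still converges). The function names are hypothetical.

```python
import math


def a(n):
    """Example terms: sum of a(n)/n converges."""
    return n ** -0.5


def l(n):
    """Example multipliers: l(n) -> infinity, sum of l(n) a(n)/n converges."""
    return n ** 0.25


def block_starts(count):
    """The integers m(1), m(2), ... defined by (7)."""
    m = [1]
    while len(m) < count:
        m.append(m[-1] + 1 + math.floor(m[-1] / l(m[-1])))
    return m


def pick_indices(m):
    """For each block [m(i), m(i+1)) pick the index n_i minimising a."""
    return [min(range(m[i], m[i + 1]), key=a) for i in range(len(m) - 1)]


starts = block_starts(60)
indices = pick_indices(starts)
gap_ratios = [(indices[i + 1] - indices[i]) / indices[i] for i in range(len(indices) - 1)]
subseries_sum = sum(a(n) for n in indices)
```

Here the gaps $n_{i+1} - n_i$ are positive while the ratios $(n_{i+1} - n_i)/n_i$ shrink, as (4) requires, and the partial sums of $a_{n_i}$ remain bounded.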

Indeed, (4) holds in virtue of (8) while applying (5) and (7) we obtain

\[
s_i = \sum_{\nu = m(i)}^{m(i+1)-1} \frac{l_\nu a_\nu}{\nu} \ge \big(m(i + 1) - m(i)\big) \frac{l_{m(i)}\,a_{n_i}}{m(i + 1)} \ge \frac{m(i)}{m(i + 1)}\,a_{n_i}.
\]
Since $\sum s_i$ converges by (6) it follows from the preceding inequality and (8) that
$\sum a_{n_i} < \infty$ as required.
COROLLARY. The conclusion of the lemma remains valid if the condition $a_n \ge 0$
is dropped provided (3) is absolutely convergent.


4. Proof of the theorem. An easy calculation [3, p. 253] gives

\[
\sigma_n^{2} = E\{(f_n - \bar{f}_n)^{2}\} = \delta_n + \frac{\beta_n - \gamma_n}{n}.
\]

Since both $\beta_n$ and $\gamma_n$ are between zero and one we have

\[
-\frac{1}{n} \le \sigma_n^{2} - \delta_n \le \frac{1}{n}.
\]
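
The identity $\sigma_n^{2} = \delta_n + (\beta_n - \gamma_n)/n$ can be verified exactly on a small dependent example. The joint distribution below is an arbitrary assumption chosen for illustration (the weight of an outcome grows with the number of realized events, which makes the events dependent); the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def variance_two_ways(joint):
    """joint maps 0/1 tuples (indicators of A_1..A_n) to exact probabilities.

    Returns (variance of f_n computed directly, delta_n + (beta_n - gamma_n)/n).
    """
    n = len(next(iter(joint)))
    beta = sum(p * sum(w) for w, p in joint.items()) / n
    pair_total = sum(
        p * sum(w[i] * w[j] for i in range(n) for j in range(i + 1, n))
        for w, p in joint.items()
    )
    gamma = 2 * pair_total / (n * (n - 1))
    delta = gamma - beta ** 2
    # E{f_n} equals beta_n, so the variance can be taken about beta.
    direct = sum(p * (Fraction(sum(w), n) - beta) ** 2 for w, p in joint.items())
    return direct, delta + (beta - gamma) / n


outcomes = list(product((0, 1), repeat=3))
weights = {w: 1 + sum(w) ** 2 for w in outcomes}      # dependent events
total = sum(weights.values())
joint = {w: Fraction(weights[w], total) for w in outcomes}

direct_variance, formula_variance = variance_two_ways(joint)
```

Because exact fractions are used, the two values coincide identically rather than only up to rounding.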
Therefore it follows from the assumption of the theorem that $\sum (\sigma_n^{2}/n)$ is convergent.
Hence by the lemma there exists a sequence of integers $n_i$ satisfying
(4) and such that $\sum \sigma_{n_i}^{2}$ converges.


³ Take e.g. $l_n = \big(\sum_{\nu \ge n} a_\nu/\nu\big)^{-1/2}$ (cf. [2, p. 299]).


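Applying Tchebycheff's inequality to $\varphi_{n_\nu} = f_{n_\nu} - \bar{f}_{n_\nu}$ and adding for $\nu \ge i$
we have for every $\epsilon > 0$

\[
\Pr\big(\sup_{\nu \ge i} |\varphi_{n_\nu}| > \epsilon\big) \le \frac{1}{\epsilon^{2}} \sum_{\nu \ge i} \sigma_{n_\nu}^{2}. \tag{9}
\]

If $n_i \le n < n_{i+1}$, then

\[
|f_n - f_{n_i}| = \left| \frac{R_n}{n} - \frac{R_{n_i}}{n_i} \right| \le \frac{n_{i+1} - n_i}{n_i}.
\]

Denoting the last term of this inequality by $\epsilon_i$ and putting $\bar{\epsilon}_i = \max_{\nu \ge i} \epsilon_\nu$ we
have from (9)

\[
\Pr\big(\sup_{n \ge n_i} |\varphi_n| > \epsilon + 2\bar{\epsilon}_i\big) \le \frac{1}{\epsilon^{2}} \sum_{\nu \ge i} \sigma_{n_\nu}^{2}.
\]

As $\bar{\epsilon}_i \to 0$ and the right hand term is the remainder of a convergent series, (2)
follows and the theorem is proved.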


5. Remarks. 1. The lemma used here can also be applied to the study of
the order of magnitude of $\varphi_n$ in the almost certain sense.

2. If the terms of (3) are decreasing then the existence of a convergent subseries
of $\sum a_n$ satisfying (4) implies $\sum_{i=1}^{\infty} a_{2^{i}} < \infty$. But this is equivalent to the
convergence of the series with monotone terms (3) (cf. e.g. [2, p. 130]). Hence
in this case the convergence of (3) is necessary as well as sufficient for the validity
of the lemma. It may be possible to use this remark in order to establish in
some special cases, where the interdependence of the variables decreases steadily
in a suitable sense, necessary and sufficient conditions for strong stability.

3. The sequence of $\delta_n$ is, of course, of very specialized structure. Thus, since
the stability of (1) is equivalent [3, p. 255] to $\delta_n \to 0$ and is implied by strong
stability, it follows that $\delta_n \to 0$ whenever $\sum (\delta_n/n)$ is convergent.

Added in proof: Since this paper was submitted I heard from Professor M.
Loève that he has independently obtained the theorem of section 2 by another
method.

REFERENCES 
[1] A. Kolmogoroff, Grundbegriffe der Wahrscheinlichkeitsrechnung, Ergeb. d. Math. Vol. 2,
no. 3, Springer, Berlin, 1933.
[2] K. Knopp, Theory and Applications of Infinite Series, Blackie, London and Glasgow,
1928.
[3] M. Loève, "Etude asymptotique des sommes de variables aléatoires liées," Jour. de
Math. pures et appl., Vol. 24 (1945), pp. 249-318.










A NOTE ON WEIGHING DESIGN 


By K. S. BANERJEE


Pusa, Bihar, India 


1. Efficiency of weighing designs given by a three-fourth replicate. In the
June issue of the Annals, Kempthorne [1] approached the construction of the
orthogonal matrix $X$ through fractional replicates, the original treatment of
which was given by Finney [2]. Reference has been made to the use of a three
fourth replicate for weighing designs. Details for such designs have not been
furnished as their efficiency is lower than for the designs given by the completely
orthogonal matrix $X$. In a three fourth replicate the treatment combinations
have to be chosen in a particular manner for a comparatively easier
analytical treatment both from the point of view of agrobiological experiments
as well as weighing designs. The variance of each of the estimates in such a case
will be $\sigma^{2}/2^{n-1}$. As a matter of fact, in a weighing design given by a fractional
replicate of the type of $(2^{p} - 1)/2^{p}$, $(p = 1, 2, \cdots n)$, of $2^{n}$ experiments, the
variance of the estimate for each object is independent of the fraction used and
is equal to $\sigma^{2}/2^{n-1}$, the same as above.


2. Construction of a three fourth replicate. Kempthorne mentions that a
factorial design of fraction $\tfrac{3}{4}$ could be taken to consist of a $\tfrac{1}{2}$ replicate on the
identity $I = ABC$ and a quarter replicate based on the identity
\[
I = A = BC = ABC.
\]

If the half replicate based on the identity $I = ABC$ be taken to consist of all the
treatments corresponding to the minus signs of the treatment contrast $ABC$ [3],
the additional quarter replicate can be chosen in two different ways. When
however the treatments corresponding to the minus signs of both $A$ and $BC$
are kept, omitting the treatments corresponding to the plus signs of $A$ and $BC$,
the three fourth replicate so obtained will have certain advantages, which will
not be available if the quarter replicate to be added is chosen to consist of the
treatments corresponding to the plus signs of $A$ and $BC$.


3. Behavior of the contrasts in a three fourth replicate and the efficiency of
the weighing designs. In general, if there are $n$ treatments giving rise to $2^{n}$
treatment combinations and if the defining contrasts be chosen as

\[
I = ACD = BDE = ABCE,
\]

it will be necessary to omit the treatment combinations corresponding to the
plus signs of both $ACD$ and $BDE$, which will be $2^{n-2}$ in number. In the three
fourth replicate so obtained, $2^{n}$ treatment effects (inclusive of the mean) will
divide themselves into sets of 4 treatment contrasts each. One of the sets will
be $I$, $ACD$, $BDE$ and $ABCE$ and any other set will be formed by multiplying
any treatment contrast by the defining set namely, $I$, $ACD$, $BDE$ and $ABCE$.
Only three contrasts out of four in a set will be independent, so that only one of
the contrasts, preferably the one of the highest order interaction may be kept
as an alias (in agrobiological experiments) of the remaining three and may
therefore be omitted. Each of the four contrasts within a set will be orthogonal
to each of the other contrasts in the remaining sets, but within a set the four
contrasts will be non-orthogonal to one another. Though non-orthogonal, the
normal equations will be of the systematic type¹ and the matrix $X'X$, taking
any three contrasts out of each set of four, will take the following form:

\[
\begin{pmatrix}
x & a & a & 0 & 0 & 0 & 0 & 0 & 0 \\
a & x & a & 0 & 0 & 0 & 0 & 0 & 0 \\
a & a & x & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & x & a & a & 0 & 0 & 0 \\
0 & 0 & 0 & a & x & a & 0 & 0 & 0 \\
0 & 0 & 0 & a & a & x & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & x & a & a \\
0 & 0 & 0 & 0 & 0 & 0 & a & x & a \\
0 & 0 & 0 & 0 & 0 & 0 & a & a & x
\end{pmatrix} \tag{1}
\]


where the order of the matrix $N = 3 \cdot 2^{n-2}$ is of the form $3t$ $(t = 2^{n-2})$ and $x = 3 \cdot 2^{n-2}$,
$a = -\tfrac{1}{2} 2^{n} + \tfrac{1}{4} 2^{n} = -2^{n-2}$. The value of the above determinant $= (x - a)^{2t}
(x + 2a)^{t}$ and that of the determinant suppressing the first row and the first
column $= (x - a)^{2t-1}(x + a)(x + 2a)^{t-1}$. The ratio of the latter to the former is
$(x + a)/(x - a)(x + 2a) = 1/2^{n-1}$, substituting for $x$ and $a$. The variance of each
estimate will therefore be $\sigma^{2}/2^{n-1}$.
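
The values of $x$, $a$ and the variance factor can be confirmed by enumerating a three fourth replicate directly. The sketch below takes $n = 5$ factors with the defining contrasts $ACD$ and $BDE$ named in the text; the coding of levels as $\pm 1$ and the helper names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

FACTORS = "ABCDE"


def contrast(word, combo):
    """Sign (+1 or -1) of the interaction `word` at a treatment combination."""
    sign = 1
    for letter in word:
        sign *= combo[FACTORS.index(letter)]
    return sign


all_combos = list(product((-1, 1), repeat=len(FACTORS)))
# Omit the combinations where both ACD and BDE carry a plus sign.
kept = [c for c in all_combos
        if not (contrast("ACD", c) == 1 and contrast("BDE", c) == 1)]

x = sum(contrast("ACD", c) ** 2 for c in kept)                      # diagonal element
a = sum(contrast("ACD", c) * contrast("BDE", c) for c in kept)      # within-set element
a_with_mean = sum(contrast("ACD", c) for c in kept)                 # I against ACD

# Diagonal element of the inverse of the 3 x 3 block with x on the diagonal, a elsewhere.
variance_factor = Fraction(x + a, (x - a) * (x + 2 * a))
```

For $n = 5$ this gives $x = 24$, $a = -8$ and a variance factor of $1/16 = 1/2^{n-1}$.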


4. General case. When a fraction of the type $\alpha/2^{p} = (2^{p} - 1)/2^{p}$ is used,
the treatment combinations corresponding to the plus signs of the $p$ independent
contrasts are omitted. Out of each set of $2^{p}$ treatment contrasts, only $\alpha = 2^{p} - 1$
will be independent and the matrix will then take a form like that of (1), where

\[
x = [(2^{p} - 1)2^{n}]/2^{p} = 2^{n-p}(2^{p} - 1) \quad \text{and} \quad
a = -\tfrac{1}{2} 2^{n} + [(2^{p-1} - 1)2^{n}]/2^{p} = -2^{n-p},
\]
and the corresponding ratio of determinants is
\[
[x + (\alpha - 2)a]/(x - a)[x + (\alpha - 1)a] = (2 \cdot 2^{n-p})/2^{n} 2^{n-p} = 1/2^{n-1}.
\]
The variance of each estimate $= \sigma^{2}/2^{n-1}$, the same as before. When a completely
orthogonalised matrix of the order $(\alpha 2^{n})/2^{p} = 2^{n-p}(2^{p} - 1)$ is available,
the variance of an estimate will be $\sigma^{2}/2^{n-p}(2^{p} - 1)$. The ratio of the two
variances $= 2^{n-1}/(2^{n} - 2^{n-p}) = 2^{p-1}/(2^{p} - 1)$, which shows how the efficiency
of the weighing design decreases with the increasing value of the fraction.

When $p = 1$, i.e. in a half replicate, the efficiency is 100 percent. The value of
the fraction is never less than $\tfrac{1}{2}$.
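
The efficiency ratio can be tabulated for a few fractions. This is a small illustrative sketch; the symbol $p$ for the number of independent defining contrasts, the default $n = 10$ and the function name are assumptions made here.

```python
from fractions import Fraction


def efficiency(p, n=10):
    """Variance with a fully orthogonal design divided by the fractional-replicate variance."""
    var_fractional = Fraction(1, 2 ** (n - 1))
    var_orthogonal = Fraction(1, 2 ** (n - p) * (2 ** p - 1))
    return var_orthogonal / var_fractional


efficiencies = [efficiency(p) for p in range(1, 6)]
```

The list starts at 1 for the half replicate, is 2/3 for the three fourth replicate, and decreases toward 1/2 as $p$ grows; it does not depend on $n$.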










1 The analysis of the data available from agrobiological experiments will not be cumber- 
some to a prohibitive extent as in many other experiments where non-orthogonality creeps 
in. The results of investigation in this direction have already been communicated for 
publication elsewhere. 










5. Independence of the estimates given by Ly in a biased spring balance. 
Kempthorne mentions that although the optimum designs for the spring balance 
case suggested by Mood furnish somewhat smaller variance than what is given 
by fractional replicates, these designs have the disadvantage that the estimates 
are correlated, whereas the estimates furnished by fractional replicates are 
orthogonal. The designs furnished by fractional replicates take account of the 
bias and if the weighing operation corresponding to the bias is omitted (in case 
where the spring balance is free from bias), the resultant scheme will fail to give 
independent estimates and the variance factors will be of the same magnitude 
as in the optimum design Ly of Mood with the same number of weighings. 
Again, these optimum designs may also be made to furnish independent estimates 
when the designs are adjusted in the manner as suggested by Mood to suit a 
biased spring balance. 

It is true that the design matrix $L_3$ given by 

$$X = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}$$

does not give independent estimates as such; but when it is assumed that the 
spring balance has a bias and the design matrix is modified as follows: 

$$X = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 1 \end{pmatrix} \tag{2}$$

the estimates except that for the bias will be orthogonal to one another and the 
variance of the estimated weights will necessarily be larger in value. 

Before proving the general case, we notice that when $-1$ is substituted for 0 
in (2) above, the resultant scheme will be an orthogonalised matrix. This is 
true not only in this particular instance but will hold good also in general. The 
constitution will be clear when the method of construction of $L_N$ from $H_{N+1}$ 
is recalled. 

The distribution of ones in $L_N$ gives a special type of symmetrical balanced 
incomplete block design, where $r = k = \tfrac{1}{2}(N + 1)$ and $\lambda = \tfrac{1}{4}(N + 1)$, while the 
distribution of zeros gives the complementary design for which $r_0 = r - 1$, 
$k_0 = k - 1$ and $\lambda_0 = \lambda - 1$. Therefore when a row of zeros and a column of 
ones (in that order) is added to $L_N$, the matrix $X'X$ of the resultant scheme takes 
the following form: 


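$$X'X = \begin{pmatrix} N+1 & r & r & \cdots & r \\ r & r & \lambda & \cdots & \lambda \\ r & \lambda & r & \cdots & \lambda \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ r & \lambda & \lambda & \cdots & r \end{pmatrix} \tag{3}$$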




Making use of the identities well known in the theory of balanced incomplete 
block designs and remembering the relationships $2\lambda = r = k = \tfrac{1}{2}(N + 1)$, we have: 

(I) The value of the determinant of 
$X'X = (r - \lambda)^{N-1}[(N + 1)\{r + \lambda(N - 1)\} - r^2 N] = (r - \lambda)^{N-1}[r + \lambda(N - 1)]$, 

(II) The value of the determinant suppressing the first row and the first 
column $= (r - \lambda)^{N-1}[r + \lambda(N - 1)]$, 

(III) The value suppressing the second row and the second column 
$= (r - \lambda)^{N-2}[(N + 1)\{r + \lambda(N - 2)\} - r^2(N - 1)]$ 
$= (r - \lambda)^{N-2}[r + \lambda(N - 1)]$, 

(IV) The value suppressing the first row and the third column 
$= (r - \lambda)^{N-2}[r\{r + \lambda(N - 2)\} - r\lambda(N - 1)]$ 
$= r(r - \lambda)^{N-1}$, 

(V) The value suppressing the second row and third column 
$= (r - \lambda)^{N-2}[\lambda(N + 1) - r^2]$ 
$= 0$. 
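Hence, the reciprocal matrix of $X'X$ will be given by 

$$[X'X]^{-1} = \begin{pmatrix} 1 & -1/k & -1/k & \cdots & -1/k \\ -1/k & 2/k & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -1/k & 0 & 0 & \cdots & 2/k \end{pmatrix} \tag{4}$$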


Let $Y'$ denote the column matrix of the results of the weighings, $y_0, y_1, \cdots, y_N$, 
and $B'$ the column matrix of the estimates of the weights $b_0, b_1, \cdots, b_N$. Then 
the estimates will be given by the equation 

$$B' = [X'X]^{-1}X'Y'.$$

It is easy to see that all the rows except the first in $[X'X]^{-1}X'$ are orthogonal to 
one another. To explain this, let us take the design given by (2). Here 

$$X' = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{pmatrix}.$$

Then $[X'X]^{-1}X'$ will be of the form 

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ -1/k & +1/k & +1/k & -1/k \\ -1/k & +1/k & -1/k & +1/k \\ -1/k & -1/k & +1/k & +1/k \end{pmatrix}.$$

In all the rows excepting the first, for every 0 and +1 in $X'$, there will 
respectively be a $-1/k$ and a $+1/k$ in $[X'X]^{-1}X'$. It has been mentioned before 
that an orthogonal matrix is obtained when $-1$ is substituted for every 0 in $X$ 
or $X'$. Hence, $N$ rows (all except the first) of $[X'X]^{-1}X'$ will be orthogonal 
and these $N$ rows will estimate the $N$ weights in orthogonal linear combinations 
of $y_0, y_1, \cdots, y_N$. 
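
The orthogonality argument above can be checked by direct computation. The following Python sketch is an illustration added here, not part of the original note: it appends a row of zeros and a column of ones to a design, forms $[X'X]^{-1}X'$ in exact rational arithmetic, and tests whether the rows after the first are mutually orthogonal. All function names are hypothetical, and the particular $L_7$ used (the complement of the cyclic design generated by {0, 1, 3} mod 7, which has r = k = 4 and λ = 2) is an assumed convenient representation, not necessarily Mood's arrangement.

```python
from fractions import Fraction


def invert(matrix):
    """Gauss-Jordan inverse in exact rational arithmetic (matrix must be nonsingular)."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def biased_design(design):
    """Add a row of zeros, then a column of ones, as in design (2)."""
    width = len(design[0])
    return [[1] + [0] * width] + [[1] + list(row) for row in design]


def estimator_rows(design):
    """Rows of [X'X]^{-1} X' for the biased spring-balance design."""
    x = biased_design(design)
    xt = transpose(x)
    return matmul(invert(matmul(xt, x)), xt)


def weight_rows_orthogonal(rows):
    """True if all rows except the first (the bias row) are mutually orthogonal."""
    weights = rows[1:]
    return all(sum(a * b for a, b in zip(weights[i], weights[j])) == 0
               for i in range(len(weights)) for j in range(i + 1, len(weights)))


L3 = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
# Assumed representation of L_7: ones where (j - i) mod 7 is outside {0, 1, 3}.
L7 = [[int((j - i) % 7 not in (0, 1, 3)) for j in range(7)] for i in range(7)]

for design in (L3, L7):
    print(weight_rows_orthogonal(estimator_rows(design)))
```

For $L_3$ the computed rows can be compared entry by entry with the matrix displayed above, taking k = 2.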

It has been mentioned before that the distribution of zeros in $L_N$ gives the 
complementary design, for which $r_0 = r - 1$, $k_0 = k - 1$ and $\lambda_0 = \lambda - 1$. 
If to such a design, a row of ones and a column of ones (in that order) be added 
to suit the estimation of the weights in a biased spring balance, exactly a similar 
situation will be obtained and the estimates will be orthogonal. It can readily 
be seen that the design furnished by Yates to weigh seven light objects and a 
bias is an illustration of this kind. The scheme given by Yates is the 
complementary design of $L_7$ with an additional row and a column of ones added to $L_7$. 

The sixteen combinations of ten objects, a, b, c, d, e, f, g, h, k, l include (1), 
which corresponds to weighing with empty pans or, in other words, which is 
devoted to estimating the bias. When (1) is omitted, $X'X$ will be of the form 

$$\begin{pmatrix} r & \lambda & \cdots & \lambda \\ \lambda & r & \cdots & \lambda \\ \vdots & \vdots & \ddots & \vdots \\ \lambda & \lambda & \cdots & r \end{pmatrix}$$

where $r = 8$ and $\lambda = 4$. The above matrix $X'X$ is obviously of the same form 
as given by $L_{15}$. 

By following exactly the same procedure as given above, it can easily be seen 
that when the weighing operation 1 is included in the weighing design, the 
solution of the normal equations will lead to independent estimates. The 
absence of each letter will be a 0 and the presence a + 1 in the design matrix 
and if —1 is substituted for every zero, the resultant matrix will be orthogonal. 
In some cases, however, the number of letters in all the combinations will not be 
the same, i.e. $k$ will not be constant. In such a situation, $k$ in (4) will take the 
value of $r$ or of $2\lambda$. 


REFERENCES 


[1] O. Kempthorne, "The factorial approach to the weighing problem," Annals of Math. 
Stat., Vol. 19 (1948), pp. 238-245. 

[2] D. J. Finney, "The fractional replication of factorial arrangements," Annals of 
Eugenics, Vol. 12 (1945), pp. 291-301. 

[3] F. Yates, Tech. Commun. Bur. Soil Sci. Harpenden no. 35 (1937), p. 11. 

[4] Harold Hotelling, "Some improvements in weighing and other experimental 
techniques," Annals of Math. Stat., Vol. 15 (1944), pp. 297-306. 

[5] K. Kishen, "On the design of experiments for weighing," Annals of Math. Stat., Vol. 16 
(1945), pp. 294-301. 

[6] A. M. Mood, "On Hotelling's weighing problem," Annals of Math. Stat., Vol. 17 (1946), 
pp. 432-446. 

[7] R. L. Plackett and J. P. Burman, "The design of optimum multifactorial experiments," 
Biometrika, Vol. 33 (1946), pp. 305-325. 




CONTROL CHART FOR LARGEST AND SMALLEST VALUES 
By John M. Howell 


Los Angeles City College 


1. Introduction. It may at times be desirable to use a control chart for 
largest and smallest values (L & S) in place of the conventional charts for 
averages and ranges ($\bar{X}$ & $R$). The chart for largest and smallest values has 
certain advantages: all information may be combined on one chart, computations 
are simple, and specifications may be placed on the chart. In this paper, 
constants for the use of this chart are developed and comparison is made with 
the average and range charts. 


2. Constants for determining limits. Let $L$ and $S$ denote the largest and 
smallest values, respectively, in a sample of $n$ pieces, and let $\bar{L}$ and $\bar{S}$ denote the 
averages of these values for $k$ samples. Then $(\bar{L} + \bar{S})/2$ and $(\bar{L} - \bar{S})/d_2$ are 
unbiased estimates of the population mean and standard deviation, respectively, 
in the case of a random sample from a normal population. The value of the 
constant $d_2$ is given in [1] and repeated in Table I for convenience. If we denote 
$(\bar{L} + \bar{S})/2$ by $M$ and $(\bar{L} - \bar{S})$ by $\bar{R}$, control limits may be determined in terms of 
these statistics. 

In conformance with usual control chart practice, we will set the upper control 
limit at $\bar{L} + 3\hat{\sigma}_L$ and the lower control limit at $\bar{S} - 3\hat{\sigma}_S$, where $\hat{\sigma}_L$ is an estimate 
of the standard deviation of the largest values in samples drawn from a normal 
population, and similarly for $\hat{\sigma}_S$. The results of Tippett [2] and Pearson [3] 
for $E(R)$ of samples from a normal population were used to determine expected 
values of $L$ and $S$: $E(R) = d_2\sigma$. Here, $R$ is the range of samples of size $n$: 
$R = L - S$. But since $E[(L + S)/2] = a$ for a symmetrical distribution, then 
$E(L) = a + d_2\sigma/2$ and $E(S) = a - d_2\sigma/2$, where $a$ and $\sigma$ are the mean and 
standard deviation of the normal population from which samples are drawn. 

The probability element of the largest value [4] is given by: 

$$n[F(L)]^{n-1} f(L)\,dL, \quad \text{where } f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-a)^2/2\sigma^2} \text{ and } F(x) = \int_{-\infty}^{x} f(y)\,dy.$$

Then $E(L^2) = n\int_{-\infty}^{\infty} L^2 [F(L)]^{n-1} f(L)\,dL$. Integrals of this type, differing only 
by a constant factor, have been evaluated by Hojo [5] and from his results $d_3$ was 
determined so that $\sigma_L = \sigma_S = d_3\sigma$. Values for $d_3$ for $n$ = 2, 5, 10 are also given 
by Tippett [2]. "Three-sigma" control limits may then be given in the form: 
$M \pm A_3\bar{R}$, where $A_3 = 0.5 + 3d_3/d_2$. The expected value of the upper control 
limit will then be: $E(\mathrm{UCL}) = a + A_4\sigma$, where $A_4 = (d_2/2) + 3d_3$. Values of 
these constants for various sample sizes are given in Table I. 

In practice, it might be desired, in the case of control charts for individual 
measurements or for $L$ and $S$, to have $E(\mathrm{UCL}) = a + 3\sigma$, and the lower control 
limit symmetrically placed with respect to the central line. In this case, the 
formula for the limits would be: $M \pm 3\bar{R}/d_2$ or $M \pm \sqrt{n}\,A_2\bar{R}$, where $A_2 = 
3/(d_2\sqrt{n})$ is given in [1]. Since the efficiency of $M$ decreases rapidly with 
increasing sample size [6], it would probably be better to use $\bar{\bar{X}}$ in place of 
$M$ for determining the central line for a control chart when the sample size is 
greater than five. $\bar{\bar{X}}$ is the "average of averages" as defined in [1]. 

The chart for largest and smallest values would then consist of a chart on 
which both the largest and smallest values are plotted, with the central line at M, 
and the limits as given above. 


3. Comparison of charts for a particular case. A comparison of the L & S 
chart with the $\bar{X}$ chart for a particular case in which the sample size was three is 
given in Fig. 1. Measurements were the shear strength of spotweld coupons of 
aluminum in pounds. Since the range chart had no points above the "three-sigma" 
control limit and showed no other peculiarities, it has been omitted. 


TABLE I 
Constants for largest and smallest value chart 

  n     d_2     d_3     A_2     A_3     A_4 
  2    1.128    .825   1.880    2.72    3.03 
  3    1.693    .748   1.023    1.82    3.09 
  4    2.059    .709    .729    1.53    3.15 
  5    2.326    .670    .577    1.36    3.17 
  6    2.534    .648    .483    1.27    3.21 
  7    2.704    .627    .419    1.20    [illegible] 
  8    2.847    .614    .373    1.15    3.26 
  9    2.970    .600    .337    1.10    3.2[illegible] 
 10    3.076    .588    .308    1.07    3.30 


4. General comparison of charts. We assume a mean of zero and a standard 
deviation of unity as a "given standard," and then compute the probabilities 
when the true values are $a$ and $\sigma$. The probability of a point being inside of 
"3-sigma" control limits on the range chart under these conditions is: 
$P_1 = \Pr(R < d_2 D_4/\sigma)$, where $D_4$ is given in [1]. The probabilities for the 
range used here were found from the Pearson-Hartley tables [3]. The usual 
normality assumptions are made. 

The probability of a point being inside of "3-sigma" control limits on the 
average chart under the same conditions is: 

$$P_2 = \int_{(\sqrt{n}/\sigma)((-3/\sqrt{n}) - a)}^{(\sqrt{n}/\sigma)((3/\sqrt{n}) - a)} \varphi(t)\,dt, \quad \text{where } \varphi(t) = \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}.$$

Since Daly [7] has shown that the average and range of samples from a normal 


[Fig. 1. Control charts for the shear strength of spotweld coupons, in pounds, plotted against sample number. Chart for largest and smallest values: upper limit 682.4, central line 576.8, lower limit 471.6. Chart for averages: upper limit 639.7, average 578.8, lower limit 517.9.] 




TABLE II 

  n     a      σ     P_1    P_2   P_1P_2   P_3    N_1    N_2 
  3    0      1.0   .994   .997   .991    .991    510    510 
              1.2   .973   .988   .961    .963    116    122 
              1.5   .901   .955   .860    .868     31     33 
              2.0   .721   .866   .624    .645     10     11 
  3    0.5    1.0   .994   .983   .977    .980    198    228 
              1.2   .973   .935   .935    .939     69     74 
              1.5   .901   .917   .826    .834     25     27 
              2.0   .721   .830   .598    .694      9     13 
  3    1.0    1.0   .994   .898   .893    .931     41     65 
              1.2   .973   .855   .832    .860     25     31 
              1.5   .901   .802   .723    .740     15     17 
              2.0   .721   .746   .538    .550      8      8 
  3    2.0    1.0   .994   .323   .321    .590      5      9 
              1.2   .973   .352   .342    .510      5      7 
              1.5   .901   .378   .341    .414      5      6 
              2.0   .721   .408   .294    .321      4      5 
  5    0      1.0   .995   .997   .992    .992    570    570 
              1.2   .969   .988   .957    .957    105    105 
              1.5   .855   .955   .817    .878     23     36 
              2.0   .588   .866   .509    .545      7      8 
  5    0.5    1.0   .995   .970   .965    .980    130    227 
              1.2   .969   .942   .913    .927     51     62 
              1.5   .855   .891   .762    .791     17     20 
              2.0   .588   .805   .473    .505      7      7 
  5    1.0    1.0   .995   .776   .722    .923     15     58 
              1.2   .969   .736   .713    .828     14     25 
              1.5   .855   .695   .594    .661      9     12 
              2.0   .588   .648   .381    .426      5      6 
  5    2.0    1.0   .995   .071   .071    .512      2      7 
              1.2   .969   .110   .107    .402      3      6 
              1.5   .855   .164   .140    .286      3      4 
              2.0   .588   .230   .135    .185      3      3 



population are independent, the probability that a sample is within control 
limits on both charts is the product of the probabilities: $P_1P_2$. Thus the 
probability that a sample be outside of control limits on either chart is $1 - P_1P_2$. 

The probability of the largest and smallest values both lying in the interval 
from $-c$ to $c$ is: 

$$P_3 = \Pr(-c < S,\ L < c) = \left[\int_{(-c-a)/\sigma}^{(c-a)/\sigma} \varphi(t)\,dt\right]^n.$$

Values of this expression with lower limit $-\infty$ are given in table XXI of [8] for samples of 
sizes 3, 5, and 10. For the purpose of comparing the charts, we choose $c$ so that 
the probabilities of Type 1 errors are equal, that is: $1 - P_1P_2 = 1 - P_3$ or $P_1P_2 = P_3$ 
when the mean is zero and the standard deviation unity. Substituting in this 
equation and solving, we find: $F(c) = 0.5 + 0.5\,(.9973P_1)^{1/n}$, where $F(x) = 
\int_{-\infty}^{x} \varphi(t)\,dt$. For $n = 3$, $c = 2.99$ and for $n = 5$, $c = 3.15$. 

Comparing $P_1P_2$ with $P_3$ when the true values are $a$ and $\sigma$ will then show the 
relative power of the $\bar{X}$ & $R$ charts and the L & S chart for detecting lack of 
control. 

Finally the charts are compared by finding the number ($N_1$ for the $\bar{X}$ & $R$ 
charts and $N_2$ for the L & S chart) of samples which will detect lack of control 
with a .99 probability under the conditions given above. This is done by 
finding the smallest integer which satisfies the following inequalities: $(P_1P_2)^{N_1} < 
.01$ and $P_3^{N_2} < .01$. As may be seen from Table II, under most conditions, the 
L & S chart is nearly as good as the $\bar{X}$ & $R$ charts for detecting lack of control. 
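
The quantities compared in this section can be evaluated with a few lines of Python. The sketch below is an added illustration, not the author's computation: it uses the standard-library normal distribution for $P_2$, $P_3$ and $c$, and a simple search for the number of samples. The function names are hypothetical, and $P_1$ is not computed here; it has to be supplied from tables of the range, as in the text.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def p_average(n, a, sigma):
    """P_2: chance that a sample average falls inside the 3-sigma limits
    set for mean 0 and sd 1 when the true mean and sd are a and sigma."""
    root_n = n ** 0.5
    upper = (root_n / sigma) * (3 / root_n - a)
    lower = (root_n / sigma) * (-3 / root_n - a)
    return STANDARD_NORMAL.cdf(upper) - STANDARD_NORMAL.cdf(lower)


def p_extremes(n, a, sigma, c):
    """P_3: chance that both the largest and smallest of n values lie in (-c, c)."""
    inside = STANDARD_NORMAL.cdf((c - a) / sigma) - STANDARD_NORMAL.cdf((-c - a) / sigma)
    return inside ** n


def matching_limit(n, p_range):
    """c chosen so that P_3 = .9973 * P_1 at mean 0, sd 1 (equal Type 1 errors).
    p_range is P_1 at sigma = 1, taken from tables of the range."""
    return STANDARD_NORMAL.inv_cdf(0.5 + 0.5 * (0.9973 * p_range) ** (1 / n))


def samples_to_detect(p_inside, miss=0.01):
    """Smallest integer N with p_inside ** N < miss."""
    count = 1
    while p_inside ** count >= miss:
        count += 1
    return count


if __name__ == "__main__":
    c3 = matching_limit(3, 0.994)          # P_1 = .994 is the tabled value for n = 3
    print(round(c3, 2))
    print(round(p_extremes(3, 0.0, 2.0, 2.99), 3))
    print(samples_to_detect(0.991))
```

Because the tabled $P_1$ is rounded to three places, the value of $c$ obtained this way may differ from the printed one in the second decimal.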


REFERENCES 

[1] American Standards Association, Control Chart Method of Controlling Quality during 
Production, Z1.3-1942. 

[2] L. H. C. Tippett, "On the extreme individuals and the range of samples taken from a 
normal population," Biometrika, Vol. 18 (1925), pp. 364-387. 

[3] E. S. Pearson, "The probability integral of the range in samples of n observations 
from a normal population," Biometrika, Vol. 32 (1942), pp. 301-308. 

[4] S. S. Wilks, Mathematical Statistics, Princeton University Press, 1943, p. 91. 

[5] Hojo, "Distribution of median from a normal population," Biometrika, Vol. 23 (1931), 
p. 315. 

[6] W. A. Shewhart, Economic Control of Quality of Manufactured Product, D. Van 
Nostrand Co., 1931, p. 282. 

[7] J. F. Daly, "On the use of the sample range in an analogue of Student's t-test," Annals 
of Math. Stat., Vol. 17 (1946), pp. 71-74. 

[8] Karl Pearson, Tables for Statisticians and Biometricians, Cambridge University 
Press, 1914. 


SUFFICIENCY, TRUNCATION AND SELECTION^1 
By John W. Tukey 
Princeton University 


1. Summary. The fact that the mean and variance were sufficient statistics 
for a univariate normal distribution truncated at a fixed point was known to 


1 Prepared in connection with work sponsored by the Office of Naval Research. 










Fisher by 1931 [2]. Hotelling [3] has recently observed the corresponding fact 
for the truncated multivariate normal distribution. 

It is the aim of this note to point out that these are special cases of a general 
result, namely: If a family of distributions admits a set of sufficient statistics, then 
the family obtained by truncation to a fixed set, or by fixed selection, also admits the 
SAME set of sufficient statistics. 


2. Representation. The basic formal results about sets of sufficient statistics 
are due to Fisher [1], whose arguments, with obvious modifications, establish 
that families of distributions satisfying the usual conditions have sufficient 
statistics. The converse was established by Koopman [4] for a reasonably wide 
class of families. 

The usual condition can be easily handled and given wide application by 
representing the family of distributions in a form suggested to the author by 
Rubin, and ascribed by him to Cramér, namely: 


$$dF(x \mid \theta) = c(\theta)\, f(x \mid \theta)\, d\mu(x),$$


where $x$ is a possibly multidimensional chance quantity (i.e. random variable), 
$\theta$ is a possibly multidimensional parameter, $c(\theta)$ is a positive real function of $\theta$ 
which serves to normalize the distribution, $f(x \mid \theta)$, the relative probability 
density, is a non-negative real function of $x$ and $\theta$, and $\mu(x)$ is a positive measure 
function. In this representation the necessary and sufficient condition that 
$\{h_i(x)\}$ are a set of sufficient statistics for $\theta$ is the existence of functions $a_i(\theta)$ 
such that (cf. Koopman [4]) 


$$\frac{\partial \log f(x \mid \theta)}{\partial \theta} = \sum_i a_i(\theta)\, h_i(x). \tag{1}$$


When $\theta$ is a vector, the derivative is to be interpreted as the gradient (a vector) 
and the $a_i(\theta)$ are to be vector-valued functions of $\theta$. We notice that this 
condition concerns only the relative density function. 


3. Proof of result. Suppose the family $F(x \mid \theta)$ is truncated onto a Borel set 
$E$; this means that 

$$\Pr\{x \text{ in } E_1 \mid F(x \mid \theta) \text{ truncated to } E\} = \frac{\Pr\{x \text{ in } E \cap E_1 \mid F(x \mid \theta)\}}{\Pr\{x \text{ in } E \mid F(x \mid \theta)\}}.$$

If $\phi_E(x)$ is the characteristic function of $E$, which is $= 1$ for $x$ in $E$ and $= 0$ 
otherwise, and if 

$$k(\theta) = \Pr\{x \text{ in } E \mid F(x \mid \theta)\} = \int_E dF(x \mid \theta),$$

then the probability element of $F(x \mid \theta)$ truncated to $E$ is 

$$\frac{c(\theta)}{k(\theta)}\, f(x \mid \theta)\, \phi_E(x)\, d\mu(x) = c'(\theta)\, f(x \mid \theta)\, d\nu(x),$$

where $c'(\theta) = c(\theta)/k(\theta)$ and $d\nu(x) = \phi_E(x)\, d\mu(x)$. Truncation has not changed 
the relative density function, and the result follows from the form of (1). 

Next suppose that, instead of accepting values with probability one in $E$ 
and with probability zero outside $E$, we select according to a fixed Borel function 
$\phi(x)$, the chance of accepting a value $x$ being $\phi(x)$. The new family of 
distributions has the same sufficient statistics for the same reason. 
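
The selection case can be written out in the same notation as the truncation case. The lines below are a sketch added for illustration, under the assumptions that $\phi(x)$ does not depend on $\theta$ and that the normalizing integral $k(\theta)$ is positive; the symbol $F_\phi$ for the selected family is introduced here and does not appear in the original.

```latex
k(\theta) = \int \phi(x)\, dF(x \mid \theta),
\qquad
dF_\phi(x \mid \theta)
  = \frac{c(\theta)}{k(\theta)}\, f(x \mid \theta)\, \phi(x)\, d\mu(x)
  = c'(\theta)\, f(x \mid \theta)\, d\nu(x),
\qquad
c'(\theta) = \frac{c(\theta)}{k(\theta)}, \quad d\nu(x) = \phi(x)\, d\mu(x).
```

Only the normalizing function and the measure have changed; the relative density $f(x \mid \theta)$ is the same, so condition (1) holds with the same $a_i(\theta)$ and $h_i(x)$.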


REFERENCES 

[1] R. A. Fisher, "Theory of statistical estimation," Camb. Phil. Soc. Proc., Vol. 22 
(1923-25), pp. 700-725. 

[2] R. A. Fisher, "The sampling error of estimated deviates together with other 
illustrations of the properties and applications of the integrals and derivatives of the 
normal error function," Brit. Assn. Adv. Sci. Mathematical Tables, Vol. 1, xxvi-xxxv. 

[3] H. Hotelling, "Abstracts of Madison Meeting," Annals of Math. Stat., Vol. 19 (1948). 

[4] B. O. Koopman, "On distributions admitting a sufficient statistic," Trans. Amer. 
Math. Soc., Vol. 39, pp. 399-409. 




ON A PROBABILITY DISTRIBUTION 
By Max A. Woodbury 


University of Michigan 


1. Introduction. The problem treated is that of generalizing the Bernoulli 
distribution to the case where the probability of success is not constant from trial 
to trial but depends on the number of previous successes. The case where the 
probability of an event depends on the number of trials is easily handled and 
is not the case treated here. Several special cases of such a distribution have 
been worked out at one time or another. (E.g. C. C. Craig found the solution for 
one such special case and thus called the author's attention to the problem.) 

The solution involves the Newton divided difference expansion of powers in a 
form which can be utilized for computation if the number of trials is not too 
large. In the case where the probabilities on a single trial are small an 
approximation (similar to that of the Poisson distribution to the Bernoulli distribution) 
can be found. 

Applications can obviously be made to urn schema in which black balls are 
replaced, but white balls are removed. Similarly, applications can be made to 
the distribution of the number of plants in a given area. 


2. Solution of the problem. Specifically the problem is as follows: "What is 
the probability that in $n$ trials of an event it will occur $x$ times presuming that 
the probability of the event on a given trial depends only on the number of 
previous successes?" Denote by $P(n, x)$ the probability of $x$ successes in $n$ 
trials and by $p_x$ the probability of the event after $x$ previous successes. As is 
conventional, denote $q_x = 1 - p_x$; one can then formulate the following equation 
of partial differences: 

$$P(n + 1, x + 1) = p_x P(n, x) + q_{x+1} P(n, x + 1). \tag{1}$$

This equation is an obvious consequence of the statement that $x + 1$ successes 
in $n + 1$ trials can only occur if there are $x$ successes in $n$ trials and a success on 
the $(n + 1)$st, or $x + 1$ successes in $n$ trials and failure on the $(n + 1)$st. The 
boundary conditions appropriate are: 

$$P(n, x) = 0 \text{ for } x < 0 \text{ or } x > n, \text{ and } P(0, 0) = 1. \tag{2}$$

It is convenient and appropriate to generalize (1) while retaining the boundary 
conditions (2). The equation (1) will be obtained from the following equation 
by setting $q = 1$: 

$$P(n + 1, x + 1) = (q - q_x) P(n, x) + q_{x+1} P(n, x + 1). \tag{3}$$

It will be noted for further reference at this point that: 

$$P(n, 0) = q_0^n \tag{4}$$

and: 

$$P(n, n) = (q - q_0)(q - q_1) \cdots (q - q_{n-1}). \tag{5}$$

This last suggests a change of variable of the form: 

$$P(n, x) = F(n, x)(q - q_0)(q - q_1) \cdots (q - q_{x-1}). \tag{6}$$

Upon substituting this expression in (3) one obtains a somewhat simpler equation 
with the same boundary conditions as (2). 

$$F(n + 1, x + 1) = F(n, x) + q_{x+1} F(n, x + 1). \tag{7}$$

Using the generating function: 

$$G(x, \xi) = \sum_{n=x}^{\infty} F(n, x)\, \xi^n \tag{8}$$

one may obtain from (7), using the boundary conditions (2), the following 
ordinary linear difference equation: 

$$G(x + 1, \xi) = \xi\,[G(x, \xi) + q_{x+1} G(x + 1, \xi)]. \tag{9}$$

From (4) it is easily seen that: 

$$G(0, \xi) = 1/[1 - q_0\xi], \tag{10}$$

and hence that the solution of (9) is: 

$$G(x, \xi) = \xi^x / [(1 - q_0\xi)(1 - q_1\xi) \cdots (1 - q_x\xi)]. \tag{11}$$

This may be expanded in partial fractions and the result written: 

$$G(x, \xi) = \xi^x \sum_{i=0}^{x} \frac{q_i^x}{(q_i - q_0) \cdots (q_i - q_{i-1})(q_i - q_{i+1}) \cdots (q_i - q_x)(1 - q_i\xi)}. \tag{12}$$







By means of the relation in (8) one deduces readily that: 

$$F(n, x) = \sum_{i=0}^{x} \frac{q_i^n}{(q_i - q_0) \cdots (q_i - q_{i-1})(q_i - q_{i+1}) \cdots (q_i - q_x)}. \tag{13}$$

Jordan [1, p. 19, eq. (1)] shows this to be the $x$th Newton divided difference of $q^n$ 
where the expansion is in terms of $(q - q_0) \cdots (q - q_x)$, for $x = 0, 1, \cdots, n$. 
The solution for (3) can now be written as: 

$$P(n, x) = (q - q_0) \cdots (q - q_{x-1})\, F(n, x) \tag{14}$$

from which follows: 

$$\sum_{x=0}^{n} P(n, x) = q^n. \tag{15}$$

As remarked before, by setting $q = 1$ one obtains the solution of (1) subject to 
the boundary conditions (2). 
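
The agreement between the recurrence (1) and the closed form (13)-(14) with $q = 1$ can be checked numerically. The Python sketch below is an added illustration with hypothetical function names; it uses exact fractions, and the closed form as coded requires the $q_i$ to be distinct, since the divided difference is written as a sum with differences in the denominators.

```python
from fractions import Fraction


def prob_by_recurrence(n, p):
    """P(n, x) for x = 0..n from recurrence (1).
    p[x] is the success probability after x previous successes (needs p[0..n-1])."""
    row = [Fraction(1)]                      # boundary condition P(0, 0) = 1
    for _ in range(n):
        new_row = [Fraction(0)] * (len(row) + 1)
        for x, mass in enumerate(row):
            new_row[x] += (1 - p[x]) * mass  # failure: still x successes
            new_row[x + 1] += p[x] * mass    # success: x + 1 successes
        row = new_row
    return row


def prob_by_divided_difference(n, x, p):
    """P(n, x) from (13) and (14) with q = 1; needs p[0..x] with distinct values."""
    q = [1 - p[i] for i in range(x + 1)]
    lead = Fraction(1)
    for j in range(x):
        lead *= 1 - q[j]                     # (q - q_0)...(q - q_{x-1}) at q = 1
    total = Fraction(0)
    for i in range(x + 1):
        denom = Fraction(1)
        for j in range(x + 1):
            if j != i:
                denom *= q[i] - q[j]
        total += q[i] ** n / denom
    return lead * total


if __name__ == "__main__":
    trials = 5
    # Illustrative probabilities: success becomes less likely after each success.
    p = [Fraction(1, k + 2) for k in range(trials + 1)]
    direct = prob_by_recurrence(trials, p)
    closed = [prob_by_divided_difference(trials, x, p) for x in range(trials + 1)]
    print(direct == closed, sum(direct))
```

The sum of the probabilities over $x$ corresponds to (15) with $q = 1$.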

It is clear that when all the $q_i$ are equal the Bernoulli distribution should 
come out as a special case. Since in this case the divided difference becomes the 
corresponding derivative divided by the appropriate factorial, one obtains: 

$$P(n, x) = \frac{(1 - q_0)^x}{x!} \left. \frac{d^x q^n}{dq^x} \right|_{q = q_0}. \tag{16}$$

Upon reduction this yields the usual formula, but not in the usual way. 
By choosing $p_x = \lambda_x/n$ and allowing $n$ to increase without limit one obtains 
an analogue of the Poisson distribution, viz: 

$$P(x) = (-\lambda_0) \cdots (-\lambda_{x-1}) \sum_{i=0}^{x} \frac{e^{-\lambda_i}}{(\lambda_i - \lambda_0) \cdots (\lambda_i - \lambda_{i-1})(\lambda_i - \lambda_{i+1}) \cdots (\lambda_i - \lambda_x)} \tag{17}$$

which corresponds to the expansion of $e^{-\lambda}$ about $\lambda_0, \lambda_1, \lambda_2, \cdots, \lambda_x, \cdots$ when $\lambda = 0$. 


REFERENCE 


[1] Charles Jordan, Calculus of Finite Differences, Chelsea Publishing Co., New York, 
2nd ed., 1947. 




A GRAPHICAL DETERMINATION OF SAMPLE SIZE FOR WILKS’ 
TOLERANCE LIMITS 


By Z. W. Birnbaum and H. S. Zuckerman 
University of Washington 


1. Summary. To determine the smallest sample size for which the minimum 
and the maximum of a sample are the $100\beta\%$ distribution-free tolerance 
limits at the probability level $\epsilon$, one has to solve the equation 

$$N\beta^{N-1} - (N - 1)\beta^N = 1 - \epsilon \tag{1}$$

given by S. S. Wilks [1]. A direct numerical solution of (1) by trial requires 
rather laborious tabulations. An approximate formula for the solution has 
been indicated by H. Scheffé and J. W. Tukey [2]; however, an analytic proof for 
this approximation does not seem to be available. The present note describes 
a graph which makes it possible to solve (1) with sufficient accuracy for all 
practically useful values of $\beta$ and $\epsilon$. 


2. Construction of the graph. Substituting in (1) 

$$N = \frac{\beta}{1 - \beta}\, x$$

we obtain 

$$1 + x = (1 - \epsilon)\, \beta^{-\frac{\beta}{1-\beta} x}$$

and 

$$\log (1 + x) = -\log \frac{1}{1 - \epsilon} + \left(\frac{\beta}{1 - \beta} \log \frac{1}{\beta}\right) x. \tag{2}$$

To solve (2) graphically, one has to find the intersection of the curve 

$$y = \log (1 + x) \tag{3}$$

with the line 

$$y = -\log \frac{1}{1 - \epsilon} + \left(\frac{\beta}{1 - \beta} \log \frac{1}{\beta}\right) x.$$

To prepare a graph on which this can be done, one first plots (3) once for all 
(Figure 1, Curve C). Then one marks the points $-\log \frac{1}{1-\epsilon}$ on the y-axis 
and labels them with the values of $\epsilon$ (Figure 1, Scale I); chooses a constant $r > 0$ 
and marks the points $r \log \frac{1}{1-\epsilon}$ on the x-axis (Figure 1, Scale II); chooses a 
constant $k > 0$, marks the points $kr \frac{\beta}{1-\beta} \log \frac{1}{\beta}$ on the x-axis, draws vertical lines 
through each of these points, and labels them with the values of $\beta$ (Figure 1, 
Scale III); draws the line $x = k$ (Figure 1, line L); marks the uniform Scale IV 
on the x-axis. 
The graph reproduced here has been prepared with $r = 4$, $k = 5$. It can 
easily be verified that the instructions on the graph lead to solutions $x$ of (2) and 
$N = \frac{\beta}{1-\beta}\, x$ of (1). 





[Fig. 1. Graph for finding an approximate solution of the equation $N\beta^{N-1} - (N - 1)\beta^N = 1 - \epsilon$. Instructions printed on the graph: (1) connect $\epsilon$ on Scale I and $\epsilon$ on Scale II with a straight line; this line cuts the vertical line marked $\beta$ on Scale III at point P; (2) locate on line L the point with the ordinate of P; call this point Q; (3) connect $\epsilon$ on Scale I with Q; the connecting line cuts curve C at a point which has abscissa $x$ on Scale IV; read off $x$; (4) compute $N = \frac{\beta}{1-\beta}\, x$.] 



3. Improvement by iterations. The graphical solution, usually accurate to 
two significant digits, may be improved easily by iterations. Replacing (2) 
by the equation 

$$x = \left[\log (1 + x) + \log \frac{1}{1 - \epsilon}\right] \Big/ \left(\frac{\beta}{1 - \beta} \log \frac{1}{\beta}\right) = f(x) \tag{4}$$

one obtains iterations $x_{i+1} = f(x_i)$ which, for $.80 \le \epsilon \le .999$ and $.80 \le \beta \le .999$, 
converge rapidly to the solution of (2). 

EXAMPLE. For $\epsilon = .99$, $\beta = .999$, one finds graphically $x_1 = 6.6$, and from 
(4) the iteration formula $x_{i+1} = \dfrac{\log (1 + x_i) + 2}{999 \log (1/.999)}$ which yields the values $x_2 = 
6.642$, $x_3 = 6.648$, $x_4 = 6.649$, $x_5 = 6.649$. Rounding up we obtain the sample 
size $N = 6.649 \cdot 999 = 6643$. 

For $\epsilon$ and $\beta$ between .80 and .999 all iterations obtained from (4) are on the 
same side of the exact solution and converge to it monotonically. Thus, in our 
example, from $x_1 < x_2$ we conclude that $x_2$ as well as all further iterations are 
smaller than the exact solution. 
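
The iteration (4) is easy to carry out by machine. The following Python sketch is an added illustration with hypothetical function names: it iterates $x_{i+1} = f(x_i)$ with natural logarithms (the base cancels in (4)), converts the limit to a sample size by $N = \beta x/(1 - \beta)$ rounded up, and evaluates the left side of (1) so the result can be checked. The printed iterates in the example were presumably obtained with rounded table logarithms, so a floating-point run need not reproduce them digit for digit.

```python
from math import ceil, log


def iterate_x(eps, beta, x_start, tol=1e-12, max_iter=200):
    """Fixed-point iteration x <- f(x) from equation (4)."""
    slope = beta / (1 - beta) * log(1 / beta)
    shift = log(1 / (1 - eps))
    x = x_start
    for _ in range(max_iter):
        x_next = (log(1 + x) + shift) / slope
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("iteration did not converge")


def wilks_left_side(n, beta):
    """Left side of equation (1): N * beta**(N-1) - (N-1) * beta**N."""
    return n * beta ** (n - 1) - (n - 1) * beta ** n


def sample_size(eps, beta, x_start=1.0):
    """Smallest integer N at or above the real solution of (1)."""
    x = iterate_x(eps, beta, x_start)
    return ceil(x * beta / (1 - beta))


if __name__ == "__main__":
    eps, beta = 0.99, 0.999
    n = sample_size(eps, beta, x_start=6.6)   # graphical starting value from the example
    print(n, wilks_left_side(n, beta), 1 - eps)
```

A starting value read from the graph only shortens the iteration; any positive start in the stated range of $\epsilon$ and $\beta$ should lead to the same limit.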


REFERENCES 


[1] S. S. Wilks, Mathematical Statistics, Princeton University Press, 1943, p. 94. 
[2] H. Scheffé and J. W. Tukey, "A formula for sample sizes for population tolerance 
limits," Annals of Math. Stat., Vol. 15 (1944), p. 217. 




ABSTRACTS OF PAPERS 
(Abstracts of papers presented at the New York meeting of the Institute on April 8-9, 1949) 


1. Adjustment of an Inverse Matrix Corresponding to a Change in One 
Element of a Given Matrix. Jack Sherman and Winifred J. Morrison, The 
Texas Company Research Laboratories, Beacon, New York. 


If one element, $a_{rs}$, in a square matrix $A$ is changed by an amount $\Delta a_{rs}$, all the elements 
$b_{ij}$ in the inverse matrix $B$ are generally changed. A simple equation has been derived by 
means of which the elements $b'_{ij}$ in the resulting inverse matrix $B'$ can be computed directly 
in terms of $\Delta a_{rs}$ and the elements of $B$. The equation is 

$$b'_{ij} = b_{ij} - \frac{b_{sj}\, b_{ir}\, \Delta a_{rs}}{1 + b_{sr}\, \Delta a_{rs}}.$$

It follows that any given square matrix can be transformed into a singular matrix by 
increasing any one element in the transposed inverse matrix. 
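
The update equation can be tried on a small matrix. The Python sketch below is an added illustration, not taken from the abstract: the function names and the test matrix are hypothetical, indices are zero-based, and exact fractions are used so that the updated inverse can be compared with a freshly computed one.

```python
from fractions import Fraction


def invert(matrix):
    """Gauss-Jordan inverse in exact rational arithmetic (matrix must be nonsingular)."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def updated_inverse(b, r, s, delta):
    """Inverse of A after a[r][s] is changed by delta, computed from the old inverse b."""
    denom = 1 + b[s][r] * delta
    if denom == 0:
        raise ZeroDivisionError("the changed matrix is singular")
    size = len(b)
    return [[b[i][j] - b[s][j] * b[i][r] * delta / denom for j in range(size)]
            for i in range(size)]


if __name__ == "__main__":
    a = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]     # illustrative matrix
    b = invert(a)
    changed = [row[:] for row in a]
    changed[0][2] += 5                        # change element (r, s) = (0, 2) by 5
    print(updated_inverse(b, 0, 2, Fraction(5)) == invert(changed))
```

The guard on the denominator corresponds to the closing remark of the abstract: the changed matrix is singular exactly when $1 + b_{sr}\,\Delta a_{rs} = 0$.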


2. The Distribution of the Number of Exceedances. E. J. Gumbel, New York 
and H. von Schelling, Naval Research Laboratory, New London, Conn. 


The probability for the $m$th observation in a sample of size $n$ taken from a population 
with an unknown distribution of a continuous variate to be exceeded $x$ times in $N$ future 
trials is studied. The averages, moments, and the cumulative probability of the number 
of exceedances are calculated with the help of the hypergeometric series. The tolerance 
limits constructed by Wilks are special cases of the cumulative probability. The mean 
number of exceedances is the same as in Bernoulli's distribution. In some cases there are 
two modes, namely $m - 1$ and $m - 2$. If $n = N$, the most probable number of exceedances 
over the $m$th largest value is either $m$, or $m - 1$, and the median number of exceedances is 
equal to $m - 1$. In 50% of all cases, the largest (smallest) of $n$ past observations will not 
(always) be exceeded in $n$ future observations. If $n$ and $N$ are both large and equal, the 
distribution of the number of exceedances over the median is normal whereas the distribu- 
tion of the extremes, similar to Poisson’s distribution, has a mean m, and a variance 2m. 
The variance of the number of exceedances is largest for the median, and smallest for the 
extremes of the previous sample. These distribution-free methods may be applied to 
meteorological phenomena, such as floods, droughts, extreme temperatures (the killing 
frost), largest precipitations, etc., and permit the forecasting of the number of cases sur- 
passing a given severity. 


3. Note on the Power Function of a Quality Control Chart. Leo A. Aroian, 
Hunter College, New York. 


The power function of a quality control chart is given for a sequence of $N$ sample points 
in terms of $\alpha$ and $\gamma$, the probability of a Type I error and the power function respectively for 
a single sample point. Two different models are considered and the generalization to two 
quality control charts is indicated. 


4. Tests Between Two Means or Regression Coefficients When Observations 
are of Unequal Precision. Uttam Chand, University of North Carolina, 
Chapel Hill. 


Relative merits of different tests available for testing two means or two regression 
coefficients in relation to asymmetric and symmetric aspects of Student's hypothesis in case 
of unequal population variances have been reconsidered. In this connection the 
distribution of a certain quantity $t$, where $k$ is some inexact value of the unknown ratio of variances, 
has been obtained. The hypothesis of the equality of two linear regression functions in 
case of unequal residual variances has also been considered. 


5. Functional Expansions. Eugene W. Pike, Boston, Massachusetts. 


This paper calls attention to a new type of estimation problem, arising both in the inter- 
pretation of experimental data from complex experiments, and in the design of analogue 
computers for functions of several independent variables. 

It has long been known, though not widely recognized, that the partial sums of rows and 
columns arising in the bivariate analysis of variance represent the least squares fit of a 
functional form $[f(x) + g(y)]$ to a tabular function $F(x, y)$ of two independent variables, for 
example. More recently, several people have realized gradually that independent causes 
may combine in much more complicated ways to produce a common effect, and that 
correspondingly more complicated functional combinations, such as $[f(x) + g(y) + h(x) \cdot k(y)]$, 
can be fitted by least squares to tabular functions of $x$ and $y$. 

Examples of such expansions, as applied both to the design of computers and to the 
analysis of experimental data, will be given. 

This presentation is based on work supported by the Air Materiel Command, USAF. 
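
The row-and-column statement in the abstract above can be illustrated with a minimal Python sketch. It assumes a complete rectangular table with equal weights in every cell; the function name `additive_fit` and the convention of absorbing the grand mean into g are illustrative choices, not taken from the paper.

```python
def additive_fit(table):
    """Least-squares fit of f(x_i) + g(y_j) to a complete table F[i][j].

    Returns (f, g) with f[i] = row mean - grand mean and g[j] = column mean,
    so that f[i] + g[j] = row mean + column mean - grand mean.
    """
    rows, cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (rows * cols)
    row_means = [sum(row) / cols for row in table]
    col_means = [sum(table[i][j] for i in range(rows)) / rows
                 for j in range(cols)]
    f = [rm - grand for rm in row_means]  # centred row effects
    g = col_means                         # column effects carry the grand mean
    return f, g


def residuals(table):
    """Residual table F[i][j] - f[i] - g[j] for the additive fit."""
    f, g = additive_fit(table)
    return [[table[i][j] - f[i] - g[j] for j in range(len(g))]
            for i in range(len(f))]
```

For a table that is exactly additive the residuals vanish; for any other table each row and each column of residuals sums to zero, which is the usual normal-equation condition for this fit.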


6. The Geometric Range for Distributions of Cauchy’s Type. E. J. Gumbel,
New York City, and R. D. Keeney, Metropolitan Life Insurance Company,
New York City. 


From each of $N$ samples of large size $n$ the largest and the smallest values $X_{n,\nu}$ and $X_{1,\nu}$
($\nu = 1, 2, \ldots, N$) are taken, where each $X$ is measured from the central value of $Nn$ observations.
The sample size must be so large that the probability of any extreme $X_{n,\nu}$ and $-X_{1,\nu}$
being negative may be neglected. The distribution of the geometric means $\rho$ of the $N$ pairs
of extremes, henceforth called geometric ranges, is derived under the assumption that the
initial distribution is symmetric, unlimited and of the Cauchy type, which implies that the
moments of an order equal to or larger than $k$ ($k > 0$) diverge. Let $u$ be the expected largest
value. Then the probability density of $\xi_\nu = 2u^k\rho^{-k}$, obtained from a theorem of Elfving
(Biometrika, Vol. 35), is $\xi_\nu K_0(\xi_\nu)$, where $K_0$ is a Bessel function. This permits calculation of
all moments of $\xi_\nu$. Methods are given for estimating the parameters $u$ and $k$. The distribution
of the geometric ranges $\rho$ is again a Bessel function. A probability paper is constructed
for testing the hypothesis that the initial distribution is of Cauchy’s type. A
strict parallelism is established between the asymptotic distributions of the range for the
exponential type, and of the geometric range for Cauchy’s type. This provides a criterion
to which of the two types the initial distribution belongs.


7. On Sums of Random Integers Reduced Modulo m. A. Dvoretzky, Institute
for Advanced Study, Princeton, and J. Wolfowitz, Columbia University,
New York City.

Let $X_n$ ($n = 1, 2, \ldots$) be an infinite sequence of independent, integral-valued chance
variables, and let $m$ be any fixed integer greater than 1. Put $S_n = \sum_{\nu=1}^{n} X_\nu$ and denote $S_n$
reduced mod $m$ by $Y_n$; i.e., $Y_n$ is a random variable which assumes only the values $j = 1, 2, \ldots, m$
with respective probabilities $P_n(j) = \mathrm{Prob}\{S_n \equiv j \pmod{m}\}$. Necessary and
sufficient conditions are obtained for $Y_n$ to be equidistributed in the limit, i.e., for
$\lim_{n \to \infty} P_n(j) = 1/m$ ($j = 1, 2, \ldots, m$). Some easily applicable sufficient conditions are deduced
and the cases $m = 2, 3, 4$ are studied in detail. The rapidity with which $P_n(j) \to 1/m$ is also
studied.
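
The limiting behaviour described in the abstract above can be explored numerically with a minimal Python sketch. It assumes, more narrowly than the abstract, that the steps are identically distributed, and it labels residues 0, 1, ..., m-1 (so the abstract's value m corresponds to residue 0); the function name `residue_distribution` is illustrative only.

```python
def residue_distribution(step_probs, m, n):
    """Distribution of S_n mod m for i.i.d. integer steps.

    step_probs maps each possible step value x to Prob{X = x}.
    Returns a list whose entry j is Prob{S_n = j (mod m)}, j = 0..m-1.
    """
    dist = [0.0] * m
    dist[0] = 1.0  # S_0 = 0 with certainty
    for _ in range(n):
        new = [0.0] * m
        for j, pj in enumerate(dist):
            for x, px in step_probs.items():
                new[(j + x) % m] += pj * px  # cyclic convolution step
        dist = new
    return dist


# Steps 0 or 1 with equal probability, reduced mod 3: approaches 1/3 each.
fair = residue_distribution({0: 0.5, 1: 0.5}, 3, 60)

# Steps 0 or 2, reduced mod 4: odd residues are never reached.
stuck = residue_distribution({0: 0.5, 2: 0.5}, 4, 60)
```

The second call is a simple case in which equidistribution fails, since every partial sum is even.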




8. The Corpuscle Problem: Estimating the Surface-Volume Ratio of a Corpuscle
of Arbitrary Shape. Jerome Cornfield, National Institutes of Health,
and Harold W. Chalkley, National Cancer Institute, Bethesda, Md.


Consider a space containing F, a closed figure of arbitrary shape, volume V and surface 
area S. Let a line segment of length r be thrown in the space in such a fashion that we have
uniform distribution of the probabilities that the end point P occupies any position in the 
space and that the other end point P’ occupies any position on the surface of a sphere of 
radius r with center at P. Count the number of end points falling in F (0, 1 or 2 for a single 
throw), call it the number of hits, and denote it by h. Count the number of times the line 
intersects the surface (0, 1 or 2 times for a single throw for a non-reentrant figure, possibly 
more for a re-entrant one), call it the number of cuts and denote it by c. Then, it is proved 
that r E(h)/E(c) = 4V/S. This result is intended to provide a theoretical basis for
estimating the surface-volume ratio of physical objects of any shape.
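
The identity r E(h)/E(c) = 4V/S can be checked by simulation. The following is a minimal Python sketch, not the authors' procedure: it assumes the figure F is a sphere of radius R (so that 4V/S = 4R/3), places the first end point uniformly in a cube large enough that every segment able to touch the sphere is represented, and uses illustrative names and parameter values throughout.

```python
import math
import random


def estimate_four_v_over_s(radius=1.0, r=0.5, throws=200_000, seed=0):
    """Monte Carlo estimate of r * (total hits) / (total cuts) for a sphere."""
    rng = random.Random(seed)
    half = radius + r  # cube half-width; covers every segment that can touch F
    hits = 0
    cuts = 0
    for _ in range(throws):
        # End point P: uniform in the cube centred on the sphere.
        px, py, pz = (rng.uniform(-half, half) for _ in range(3))
        # Direction: uniform on the unit sphere (normalised Gaussian vector).
        dx, dy, dz = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        norm = math.sqrt(dx * dx + dy * dy + dz * dz)
        dx, dy, dz = dx / norm, dy / norm, dz / norm
        # End point P' at distance r from P.
        qx, qy, qz = px + r * dx, py + r * dy, pz + r * dz
        hits += (px * px + py * py + pz * pz < radius * radius)
        hits += (qx * qx + qy * qy + qz * qz < radius * radius)
        # Cuts: roots t in [0, r] of |P + t d|^2 = R^2, i.e. t^2 + 2bt + c = 0.
        b = px * dx + py * dy + pz * dz
        c = px * px + py * py + pz * pz - radius * radius
        disc = b * b - c
        if disc > 0:
            root = math.sqrt(disc)
            cuts += (0.0 <= -b - root <= r) + (0.0 <= -b + root <= r)
    return r * hits / cuts
```

With the default values the estimate should lie close to 4R/3; the agreement improves as the number of throws grows.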


9. Generalized Hit Probabilities with a Gaussian Target. D. A. S. Fraser, 
Princeton University. 


In the Supplement to the Journal of the Royal Statistical Society, Vol. 8 (1946), L. B. C. 
Cunningham and W. R. B. Hynd proposed a problem and gave an approximate solution cov- 
ering a partial range of parameter values: to find the probability that a moving target will 
survive a burst of "n" rounds from a rapid-firing gun, account being taken of correlation
between the different points of aim.

Generalizing from the case of a two dimensional target to "k" dimensions, this paper
gives the probability for 0, 1, 2, ..., n hits, under the following assumptions: the "n" points
of aim have a Multivariate Gaussian Distribution, the dispersion error has a Gaussian Distribution,
and the target is a Gaussian Diffuse Target, that is, the probability of a hit on a
particular round as a function of the coordinates of the shell has the form of "a constant
times a Gaussian probability density function."

Limiting distributions are obtained as n → ∞, subject to a variety of limiting conditions.

Numerical values for the probability of at least one hit are plotted when n = 5, for a 
range of values, relative to the target size, of dispersion and aiming errors. 


10. A New Continuous Sampling Inspection Plan Based on an Analysis of Costs. 
F. E. Satterthwaite, General Electric Company, Bridgeport, Connecticut.


Inspection, like all other industrial operations, must be run to produce the most return 
for the lowest cost. The costs include overhead and running inspection costs; complaint 
costs; rework and scrap costs; and the costs of unnecessary process rejections. Also one 
must consider the frequencies of occurrence of these costs. These include the process aver- 
age percent defective; the probability of occurrence of a complaint; and the frequency of 
occurrence of quality deteriorations. 

For continuous inspection, the percentage of the product to be inspected has a very
simple formula: $P = \sqrt{SC/(HM)}$, where S is the sensitivity of the sampling plan used, C is
the complaint cost, H is the effective inspection cost, and 1/M is the quality deterioration
rate.

It was also necessary to develop a new continuous sampling inspection plan which would
be efficient over the entire range of continuous sampling applications. The plan presented
is a sequential plan which, with suitable attention to details, is easily applied on the shop
floor. The Dodge Plan is a special case and is efficient only in a small percentage of applications.










11. On the Levels of Significance of the F and Beta Distributions. Leo A.
Aroian, Hunter College, New York.


Two formulas are given for the determination of the levels of significance of the F and
Beta distributions. In the case of the F distribution a previous set of formulas (Biometrika,
Vol. 34, pp. 359-360) is modified to give 3 significant figure accuracy, $n_1, n_2 \ge 24$. The set
for the Beta distribution is of Cornish-Fisher type, $p, q \ge 6$. The advantages of these over
Paulson’s F formula and Carter’s z formula are the avoidance of the solution of a quadratic
in the case of Paulson’s formula, and the avoidance of the exponential tables in the case of
Carter’s z formula. A short numerical table compares the three methods for selected values
of $n_1$ and $n_2$.


12. Certain Statistics for Samples of 3 From a Rectangular Population. Julius
Lieblein, National Bureau of Standards.


A continuation of a study presented at the Madison meeting of the Institute of Mathe- 
matical Statistics last September. (For abstract see Annals of Math. Stat., December 1948, 
p. 595.) The previous paper derived properties of the statistics 

gz’ ie 2” z’ ao a” 2’ a x” 
y= n= —— ie = 
Y1 a ’ J2 9 ’ Y3 2 ? 





where $x_1$, $x_2$, $x_3$ are the observations, ordered by increasing size, in an independent random
sample of three observations from a normal population, and $x'$ and $x''$, $x' \ge x''$, are the two
closest of the three. In the present paper distributions (joint as well as simple) are obtained
for the above three statistics and also for $x'''$, the remaining observation not included in
the closest pair, for samples of 3 from a rectangular population, and a theorem is proved
concerning the distribution of $y_1$ for a wide class of continuous populations.


13. The Choice of Lot Inspection Plans on the Basis of Cost. F. E. Satterthwaite
and Burton Grad, General Electric Company, Bridgeport, Connecticut.


An extension of the first paper to single sampling inspection plans. The important con- 
cepts involved are the break-even quality level, the operating ratio, and the weighted prior 
odds that a lot is a good lot. Charts are being prepared which can be entered with simple 
functions of the costs and which give directly the sample size and acceptance number for 
the most efficient single sampling inspection plan. 

It appears promising that the method can be extended to double and sequential sampling 
plans. This is imperative because of the large portion of the time that "no-inspection" is
the most efficient single sampling plan. 




NEWS AND NOTICES 


Readers are invited to submit to the Secretary of the Institute news items of interest.


Personal Items 


Enrique Loizelier Blanco, Professor of Statistics in the University of Madrid, 
has just finished the first year of experimentation in Quality Control Methods 
in different plants. Interest in these new statistical applications started
in Spain during 1946 and has increased rapidly since then, especially this year
after consecutive bimonthly intensive courses which Professor Blanco has been 
teaching. 

Mr. Osmer Carpenter, formerly an Instructor in the Department of Statistics 
and Mathematics at Iowa State College is now doing statistical work for Carbide 
and Carbon Chemical Corp., Oak Ridge, Tennessee. 

Dr. K. L. Chung, formerly of Princeton University, has been appointed to an 
assistant professorship at Cornell University. 

Dr. Clyde H. Coombs, Associate Professor of Psychology and Chief of Re- 
search Division, Bureau of Psychological Services at the University of Michigan, 
is on leave of absence for the academic year to work at Harvard University on 
problems of scaling. 

Dr. Meyer A. Girshick, formerly with the Douglas Aircraft Co., Santa Monica, 
California, has accepted a professorship in the Department of Statistics, Stanford 
University, Stanford, California. 

Dr. M. J. Gottlieb, who has been with the Institute for Advanced Study at 
Princeton, has been appointed to an assistant professorship at the Newark 
College of Rutgers University. 

Associate Professor E. H. C. Hildebrandt of Northwestern University has 
been elected President of the National Council of Teachers of Mathematics. 
He is also National Secretary-Treasurer of Pi Mu Epsilon and Secretary of the 
Mathematics Section of the Central Association of Science and Mathematics 
Teachers. 

Dr. C. A. Hollingsworth, formerly with the Acetate Section of the DuPont 
Company, is now an instructor in the Department of Chemistry, University of 
Pittsburgh. 

Professor William G. Madow, who has been with the Institute of Statistics 
at the University of North Carolina, has been appointed Professor of Statistics 
at the University of Illinois. 

Dr. Zenon Szatrowski, formerly teaching in the Economics Department of 
Northwestern University, has accepted an associate professorship in the Depart- 
ment of Economics, University of Oregon, Eugene, Oregon. 

Mr. Eric Weyl has resigned his position as staff engineer in the Chicopee Manufacturing
Corporation and is now conducting his own business as a textile engineering
consultant in Manchester, New Hampshire.



New Members 


The following persons have been elected to membership in the Institute (December 1,
1948 to February 28, 1949).

Abruzzi, Adam, M.S. (Columbia Univ.) Student in engineering at Columbia University, 
22 W. 107th Street, Shanks Village, New York. 

Agarwal, Satya P., M.A. (Agra Univ., India) Student at University of California, Inter- 
national House, Berkeley 4, California. 

Anderson, Robert W., M.A. (Columbia Univ.) Student at Columbia University, 21428-11: 
Road, Queens Village 9, New York. 

Bahadur, R.R., M.A. (Univ. of Delhi, India) Graduate Student at University of North 
Carolina, Chapel Hill, North Carolina. 

Blom, Gunnar, Fil.kand. (Stockholm) Olof Skotkonungs vag 8, Aspudden, Sweden. 

Burrows, Glenn L., M.A. (Michigan State College) Research Associate, P.O. Box 168, 
Institute of Mathematical Statistics, Chapel Hill, North Carolina.

Chapman, Carlos A., Jr., M.S. (Univ. of Michigan) Sales Statistician, Argus, Inc., Ann
Arbor, Michigan, 834 W. Huron St., Ann Arbor, Mich. 

Chiang, Chin Long, M.A. (Univ. of Calif.) Student at the University of California, 336-A 
Panoramic Way, Berkeley 4, California. 

Coggins, Paul B., M.S. (Univ. of Wisconsin) Graduate Teaching Assistant, University 
of Michigan, University Club, Madison 5, Wisconsin.

Crapsey, Marcus T., A.B. (Univ. of Michigan) Graduate student at the University of 
Michigan, 615 Monroe, Ann Arbor, Michigan. 

Coy, John W., M.A. (Univ. of New Mexico) Teaching Fellow, Department of Mathe- 
matics, University of Michigan, 2644 Whitewood, Ann Arbor, Michigan. 

Cutkosky, Richard E., Student at Carnegie Institute of Technology, Box 401, Carnegie 
Institute of Technology, Pittsburgh, Pennsylvania. 

DelPriore, Francis R., B.A. (New York Univ.) Associate Statistician, U. S. Naval Engineering
Experiment Station, 2609-22nd. Street, N.E., Washington 18, D.C.

Desind, Philip, M.S. (College of City of N. Y.) Statistician, Bureau of Ships, Navy 
Department, Washington, D. C., 7418 Georgia Ave., N.W., Washington, D.C. 

Dutka, Solomon, M.A. (Columbia Univ.) Chief Statistician, c/o Elmo Roper, 30 Rockefeller
Plaza, New York City, New York.

Dwass, Meyer, B.A. (George Washington Univ.) Graduate student at Columbia Uni- 
versity, Apt. 3A, 609 W. 115 St., New York, New York. 

Eastman, Walter F., A.B. (Harvard) Central Technical Department, The American 
Brass Co., Waterbury, Connecticut. 

Eisenpress, Harry, B.A. (College of City of N. Y.) National Bureau of Economic Re- 
search, 1819 Broadway, New York 23, New York, 2935 Ocean Parkway, Brooklyn 24, 
New York. 

Fellows, Clifford Martin, B.S. (Boston Univ.) Assistant Instructor, Boston University, 
Bureau of Research and Statistics, 685 Commonwealth Avenue, Boston 15, Massachusetts.

Gowen, John W., Ph.D. (Columbia Univ.) Professor of Genetics, Genetics Department, 
Iowa State College, 2014 Kildee, Ames, Iowa. 

Greenwood, Robert E., Ph.D. (Princeton Univ.) Assistant Professor of Applied Mathematics,
University of Texas, 1704 Windsor Road, Austin, Texas.

Hald, Anders, Ph.D. (Univ. of Copenhagen) Professor of Statistics, University of Copen- 
hagen, Emdrupvenge 94, Copenhagen 0, Denmark. 

Helms, William R., Student at Ohio State University, Stadium Club, Ohio State University,
Columbus 10, Ohio. 

Hemphill, F.M.,M.S.Ph. (Univ. of Michigan) Major, U.S. Public Health Service, School 
of Public Health, University of Michigan, Ann Arbor, Michigan. 




Himes, Harold W., B.S. (George Pepperdine College, Los Angeles) Statistician, Test 
Design and Analysis Section, U.C.D.W.R., U. S. Navy Electronics Laboratory, San 
Diego 52, California. 

Hutchinson, L. Charles, Ph.D. (Mass. Institute of Tech.) Associate Professor of Mathe- 
matics, Polytechnic Institute of Brooklyn, Brooklyn, New York. 

Klahr, Carl N., M.S. (Carnegie Institute of Tech.) Student, Atomic Energy Commission 
Fellow, Carnegie Institute of Technology, 6387 Phillips Avenue, Pittsburgh 17, Penn- 
sylvania. 

Kraemer, Herbert F., B.S. (Univ. of Delaware) Statistical Engineer, Technical Supervisor,
Commercial Solvents Corporation, Terre Haute, Indiana, 1514 South 7th St.,
Terre Haute, Indiana. 

Kuebler, Roy R., Jr., A.M. (Univ. of Pennsylvania) Associate Professor of Mathematics, 
Dickinson College, Carlisle, Pennsylvania. 

Lafontant, Herne E., M.S. (Atlantic Univ.) Student at the University of Michigan, 
615 Monroe, Ann Arbor, Michigan. 

Lal, Dip Narayan, Ph.D. (Edinburgh Univ.) Lecturer in Mathematics, Patna University,
New Dak Bungalow Road, Patna, Bihar, India. 

Liserre, Guido Orlando G., Profesor de Estadistica, Mendoza 2540, Rosario, R., Argentina. 

Matson, J. H., B.A. (Univ. of Wisconsin) Statistician, Baker Manufacturing Company, 
Evansville, Wisconsin. 

Monsch, Henry D., B.S. (Missouri School of Mines & Metallurgy, Rolla) Metallurgist, 
Aluminum Company of America, Fabricating Division, Alcoa, Tennessee, 5507 Lake
Shore Drive, Knoxville, Tennessee.

Moore, Lucius T., Ph.D. (Johns Hopkins Univ.) Associate Professor, Department of
Mathematics, Brooklyn College, 205 Hicks Street, Brooklyn, New York. 

Noack, Albert, Ph.D. (Kiel, Germany) Privatdozent, Studienrat, (24a) Hamburg- 
Lokstedt II, Tibarg 26, Germany. 

Patton, Robert E., A.B. (N.Y.State Teachers College, Albany) Graduate student at the 
University of Michigan, 522 Linden St., Ann Arbor, Michigan. 

Potter, Muriel, Ph.D. (Columbia Univ.) Instructor in Psychological Foundations, Edu- 
cational Research and Reading Supervisor, Teachers College, Columbia University, 
414 Riverside Drive, New York 25, New York. 

Putz, Robert R., B.A. (Univ. of Minnesota) Teaching Assistant, Department of Mathe- 
matics, University of California, 1631 Cornell Avenue, Berkeley 2, California. 

Ratoosh, Philburn, M.A. (Columbia Univ.) Assistant in Psychology, Department of 
Psychology, Columbia University, New York 27, New York. 

Richardson, Wyman, Jr.,S.B. (Harvard) Graduate student at the University of North 
Carolina, 208-B, Chapel Hill, North Carolina. 

Rosenbaum, Sidney, M.A. (Cambridge) Scientific Officer, Ministry of Works, 31, Multon 
House, Shore Place, London E.9, England.

Savage, I. Richard, M.S. (Univ. of Michigan) Student at Columbia University, 1414 
John Jay Hall, New York 27, New York. 

Sheerin, Gail, A.B. (Univ. of Rochester) Statistical Technician, A.E.C. Project, Uni- 
versity of Rochester, 1091 Highland Avenue, Rochester, New York. 

Siegert, Arnold J. F., Ph.D. (Leipzig, Germany) Professor of Physics, Department of 
Physics, Northwestern University, Evanston, Illinois. 

Simpson, Paul B., Ph.D. (Cornell Univ.) Assistant Professor of Economics, Department 
of Economics, Stanford University, California. 

Solem, Anson D.,M.S. (Harvard Univ.) Chief of Fragmentation Section, Naval Ordnance 
Laboratory, White Oak, Maryland, 121 Galveston St., S.W., Washington 20, D.C. 

Sorensen, Frederick A., B.S. (Carnegie Institute of Tech.) Teaching Assistant in Mathematics,
Carnegie Institute of Technology, 1204 East End Avenue, Pittsburgh 18, Pennsylvania.










Steel, Robert G. C., M.A. (Acadia Univ., Canada) Instructor and Research Associate,
Statistical Laboratory, Iowa State College, Ames, Iowa. 

Taylor, Francis B., A.M. (Columbia Univ.) Instructor in Mathematics, Manhattan 
College, New York and Graduate student at Columbia University, 345 E. 193 St., Bronx
58, New York. 

Terrell, James R., A.B. (Univ. of Michigan) Statistical Clerk, Research Center for 
Group Dynamics, P.O. Box 357, Ann Arbor, Michigan.

Tick, Leo J., B.S. (Iowa State College) Research Graduate Assistant, Statistical Labora- 
tory, Iowa State College, Ames, Iowa. 

Tyler, Sylvanus A., S.M. (Univ. of Chicago) Associate Mathematician (Biometrics), 
Argonne National Laboratory, P.O. Box 5207, 9059 So. Stewart Avenue, Chicago 20, 
Illinois. 

Tysver, Joseph B., M.A. (Washington State College) Teaching Fellow, University of 
Michigan, 1404 Erving Court, Willow Run Village, Michigan. 

Umarji, Raghavendra R., A.M. (Columbia Univ.) Lecturer in Mathematics, Bombay 
Educational Service, 509 John Jay Hall, Columbia University, New York 27, New York. 

Wilburn, A. J., A.B. (Howard Univ.) Statistician, Civil Aeronautics Board, Washington, 
D. C., 25-46th Place., N.E., Washington, D.C. 




Correction 


The information following Paul Koditschek’s name which appeared in the March issue 
of the Annals, page 149, should have appeared as follows: 


Koditschek, Paul, LL.D. (Univ. of Vienna) Research Associate, Scientific Research Service,
319 W. 13th Street, New York 14, New York. 


(It was implied in the original notice that Scientific Research Service is connected with 
Columbia University.) 




News Item from Cornell 


With the continued support of a research contract with the Office of Naval Research, the 
Mathematics Department of Cornell University is further expanding research and instruc- 
tion in the theory of probability and its applications. At present Professors Feller, Kac, 
Chung and Dr. Donsker are participating in the work. Professor G. Elfving of the Uni- 
versity of Helsingfors has been appointed Visiting Professor of Mathematical Statistics 
for the academic years 1949-1951. Professor J. L. Doob, on sabbatical leave from the Uni- 
versity of Illinois, will spend the year 1949-50 at Cornell. Dr. Gilbert Hunt has been ap- 
pointed Assistant Professor of Mathematics. 


REPORT ON THE NEW YORK MEETING OF THE INSTITUTE 


The thirty-eighth meeting of the Institute of Mathematical Statistics was 
held at Columbia University, New York City on Friday afternoon and Saturday, 
April 8-9, 1949. The meeting was attended by 93 persons including the follow- 
ing 80 members of the Institute: 


A. Abruzzi, T. W. Anderson, Leo A. Aroian, Robert Bechhofer, A. A. Bennett, Joseph 
Berkson, Allan Birnbaum, C. I. Bliss, Paul Boschan, P. G. Carlson, Uttam Chand, Yunien 
Chen, E. P. Coleman, T. F. Cope, Jerome Cornfield, L. M. Court, M. I. Cropsen, J. H. Cur- 
tiss, Cuthbert Daniel, F. R. Del Priore, W. E. Deming, J. A. Dudman, David Durand, 
C. W. Dunnett, A. Dvoretzky, P. S. Dwyer, Churchill Eisenhart, H. L. Edgett, Harry 
Eisenpress, Lillian R. Elveback, D. A. S. Fraser, Murray Geisler, L. A. Goodman, J. I. 
Griffin, C. C. Grove, E. J. Gumbel, Miriam S. Harold, Mina Haskind, L. H. Herbach, Harold
Hotelling, Cuthbert Hurd, Arthur Kaufman, Roger D. Keeney, Paul Koditschek, Carl F. 
Kossack, Howard Levene, Jack Laderman, I. D. Lorge, C. L. Marks, Paul Meier, Frederick 
Mosteller, E. B. Mundie, C. M. Mottley, I. U. Mulk, Paul Neurath, G. E. Noether, Doris 
Newman, M. L. Norden, E. W. Pike, J. K. Perrin, H. M. Rosenblatt, Frank Saidel, William 
Salkind, F. E. Satterthwaite, Richard Savage, Henry Scheffé, H. L. Seal, Jack Sherman, 
Rosedith Sitgreaves, J. H. Smith, J. J. Sodano, Herbert Solomon, Mary N. Torrey, J. W.
Tukey, S. S. Wilks, D. F. Votaw, Helen M. Walker, Lionel Weiss, Jack Wolfowitz and
W. W. Wryht. 


The Friday afternoon session consisted of a Symposium on Applications of 
Multivariate Analysis, Professor S. S. Wilks of Princeton University presiding.
The following two invited papers were given: 


1. Tests of Differences in Composite Growth Measurements in Pig Feeding Trials, J. Wishart, 
Cambridge University and University of North Carolina. 


2. Fields of Application of Multivariate Analysis, Harold Hotelling, University of North 
Carolina. 


The prepared discussion was presented by Professor S. N. Roy, Presidency 
College, Calcutta, and Columbia University, followed by discussion from the 
floor. 

The Saturday morning session was opened by a business meeting, Dr.
Churchill Eisenhart, National Bureau of Standards, presiding. Among other 
items of business the Constitution of the Institute was amended to provide for 
Institutional Membership, and the by-laws amended to specify the status and 
privileges of Institutional Members. The revised Constitution and By-Laws 
appear elsewhere in this issue. 

The second part of the session, Dr. W. Edwards Deming presiding, was 
devoted to an invited address: Non-Linear Regression Laws and “Internal Least
Squares,” by Dr. H. O. Hartley, University College, London and Princeton
University. 

At the Saturday afternoon session, Professor Henry Scheffé, Columbia University,
presiding, the following contributed papers were presented, ten in
person, three by title: 


1. Adjustment of an Inverse Matrix Corresponding to a Change in One Element of a Given 
Matrix.
Jack Sherman and Winifred J. Morrison, The Texas Company Research Laboratories, 
Beacon, N. Y. 
2. The Distribution of the Number of Exceedances.
E. J. Gumbel, New York, N. Y., and H. von Schelling, Naval Research Laboratory,
New London, Connecticut. 
3. Note on the Power Function of a Quality Control Chart. 
Leo A. Aroian, Hunter College. 
4. Tests between Two Means or Regression Coefficients When Observations Are of Unequal 
Precision. 
Uttam Chand, University of North Carolina. 
5. Functional Expansions. 
Eugene W. Pike, Boston, Massachusetts. 
6. The Geometric Range for Distributions of Cauchy’s Type. 
E. J. Gumbel, New York, N. Y., and R. D. Keeney, Metropolitan Life Insurance Com- 
pany, New York, N. Y. 
7. On Sums of Random Integers Reduced Modulo m.
A. Dvoretzky, Hebrew University, Jerusalem, and Institute for Advanced Study, and 
J. Wolfowitz, Columbia University. 
8. The Corpuscle Problem: Estimating the Surface-Volume Ratio of a Corpuscle of Arbitrary 
Shape. 
Jerome Cornfield, National Institute of Health, and Harold W. Chalkley, National
Cancer Institute, Bethesda, Maryland. 
9. Generalized Hit Probabilities with a Gaussian Target. 
D. A. S. Fraser, Princeton University. 
10. A New Continuous Sampling Inspection Plan Based on an Analysis of Costs. 
F. E. Satterthwaite, General Electric Company, Bridgeport, Connecticut. 
11. On Levels of Significance of the F and Beta Distributions. (By title) 
Leo A. Aroian, Hunter College. 
12. Certain Statistics for Samples of 3 from a Rectangular Distribution. (By title) 
Julius Lieblein, Statistical Engineering Laboratory, National Bureau of Standards. 
13. The Choice of Lot Inspection Plans on the Basis of Cost. (By title) 
F. E. Satterthwaite and Burton Grad, General Electric Company, Bridgeport, Con- 
necticut. 




On Friday evening a dinner was held at the Men’s Faculty Club. 
S. B. Littauer
Assistant Secretary 





CONSTITUTION OF THE INSTITUTE OF 
MATHEMATICAL STATISTICS 


ARTICLE 1 
PURPOSE 


The Institute of Mathematical Statistics is a society for encouraging the 
development, dissemination, and application of mathematical statistics. 


ARTICLE 2 
MEMBERS 


The Institute shall have Members and Institutional Members. Applications 
for membership must be approved by the Council. The Council may delegate 
this authority. 

Except for nonpayment of dues, no Member or Institutional Member shall be 
expelled or suspended except by three-fourths vote of the Council. 


ARTICLE 3 
OFFICERS 


The Officers of the Institute shall be the President, the President-Elect, the 
Secretary, the Treasurer, and the Editor. The terms of office of the Secretary, 
the Treasurer and the Editor shall be three years. The terms of office of the
President and the President-Elect shall be one year. The President-Elect shall 
succeed the President in that office. If the President is incapacitated, the 
President-Elect shall act as President, or, in case the President-Elect is also 
incapacitated the Secretary shall so act. Incapacity shall be determined by the 
Council. 

The President shall act as chairman of the Council and of the Executive Com- 
mittee, and shall appoint the Committees and representatives of the Institute, 
with the exception of the Committee on Fellows and the Executive Committee. 
Such Committee appointments shall be for terms of not more than three years, 
provided that committee appointments extending beyond the current year shall 
be either to standing committees with regularly rotating membership, or to
temporary committees assigned specific tasks. 

The Treasurer shall present financial statements to the Council and shall bring 
condensed statements to the attention of the Members. 

The Secretary shall record the actions of the Council and of the Executive 
Committee and of Institute meetings, arrange for and inform the Members of 
meetings and conduct the correspondence of the Institute except as otherwise 
assigned by the Executive Committee. The Secretary may appoint Assistant 
Secretaries to assist him in connection with specified meetings or for other 
occasions. The offices of Secretary and Treasurer may be combined. 



ARTICLE 4 
COUNCIL


The Council shall consist of not less than twelve elected members in addition 
to the Officers of the Institute except that vacancies in the Council occurring 
subsequent to an election shall not be filled until the next annual election. 

Elected members shall be elected for terms of three years, the terms of approxi- 
mately one-third of them terminating each year. 

The Council, representing the Members, shall determine the policies and 
supervise the affairs of the Institute in accordance with any Bylaws the Institute 
may adopt. It shall determine the standing committees of the Institute and 
the number of elected members of the Council. 

The Council shall elect the Secretary, the Treasurer, and the Editor, by 
majority vote. The Council shall determine the number, if any, of Associate 
Secretaries, Associate Treasurers and Associate Editors. The Secretary shall 
nominate Associate Secretaries, the Treasurer shall nominate Associate 
Treasurers, and the Editor shall nominate Associate Editors which the Council 
may elect by majority vote. Such Associate Secretaries, Treasurers, and 
Editors shall be non-voting members of the Council. 

The Council shall meet at least twice a year, usually at times of meetings of 
the Institute, and otherwise at the call of the President or the call of any five 
members of the Council. Any voting member unable to be present may appoint, 
in writing, a representative to speak for him, and such representative shall be 
entitled to vote. A quorum shall be seven persons entitled to vote. Majorities 
and other fractions of the Council are to be based on the number of persons 
present and entitled to vote. 


ARTICLE 5 
EXECUTIVE COMMITTEE


The Officers shall constitute the Executive Committee of the Council, and
shall conduct the affairs of the Institute.


The Executive Committee may create temporary committees with assigned 
tasks coming within the scope of the Institute. 


ARTICLE 6
NOMINATIONS 


The President shall appoint a Nominating Committee and shall announce 
their names at the annual meeting when he retires from office. This Committee 
shall submit to the Members, through the Secretary and at least sixty days 
before the closing of polls at the next succeeding annual meeting, one nomination 
for President-Elect and a slate containing at least twice as many names as there 
are vacancies on the Council. 

Additional nominations may be made for President-Elect or for the Council 
by a petition signed by twenty Members. Such nominations shall appear on
the ballot if they are in the hands of the Secretary at least 30 days before the
closing of polls at the next succeeding annual meeting. In any event, Members 
may vote for names in addition to those nominated. 


ARTICLE 7 
FELLOWS 


The Council may, by majority vote, elect to fellowship any Member nomi-
nated by the Committee on Fellows. Such nomination and election shall be on 
the basis of the nominee’s contributions to the development, dissemination, and 
application of mathematical statistics. 

ARTICLE 8 
COMMITTEE ON FELLOWS

The Council shall elect two Fellows annually to serve for three years on the 
Committee on Fellows. One of the Members whose term is next to expire shall 
be designated by the President as chairman. 

ARTICLE 9 
PUBLICATIONS 

The Annals of Mathematical Statistics shall be the official journal of the Insti- 
tute. Other publications may be authorized by the Council. 

The publications of the Institute shall be supervised by the Editor, with the
assistance of the Associate Editors and such committees as the Council may
approve. 


ARTICLE 10
COMMUNICATIONS


Public announcements concerning the Institute, including statements of policy, 
recommendations, reports of committees and accounts of Council meetings shall 
be issued by the Secretary or the President with the prior approval of the Council 
or its Executive Committee. Advance publicity concerning meetings may be 
released by authorized Program Committees or Publicity Committees. 


ARTICLE 11 
AFFILIATION 


By a three-fourths vote, the Council may authorize the affiliation of the
Institute with any organization whose aims are consistent with those of the
Institute. 


ARTICLE 12 
AMENDMENTS 


This constitution may be amended by an affirmative two-thirds vote of those 
Members voting at any regularly convened meeting of the Institute provided
notice of such proposed amendment shall have been sent to each Member by
the Secretary at least thirty days before the date of the meeting at which the 
proposal is to be acted upon. Members may vote in person or by mail. The 
Secretary shall send to the Members any amendments recommended by the 
Executive Committee or proposed through a petition of 25 members of the 
Institute. 


ARTICLE 13 
EMERGENCIES


In an emergency, as determined by the President or the Executive Committee,
or by a majority of the Council, a meeting of the Council to transact business
or a meeting of the Institute to amend the constitution may be conducted by
mail.


BY-LAWS OF THE INSTITUTE OF MATHEMATICAL STATISTICS 
ARTICLE 1 
DUTIES OF OFFICERS


The President, or in his absence the President-Elect, or in his absence a Mem- 
ber appointed by the Executive Committee, shall preside at business meetings 
of the Institute. 

The Treasurer shall send out calls for annual dues; pay all bills for expenditures
authorized by the Institute, Council, or Executive Committee; keep a detailed 
account of all receipts and expenditures; prepare a financial statement at the 
end of each fiscal year and present an abstract of same at a business meeting of 
the Institute after it has been audited by a Member or Members appointed by 
the President, to whom such Member or Members shall report. 

The Secretary shall, subject to the direction of the Council, have charge of 
the archives and other tangible and intangible property of the Institute and shall, 
upon the direction of the Council, publish a classified list of all Members of the 
Institute, and of Institutional Members at their request. 

The Editor, subject to the direction of the Council, shall have charge of all 
editorial matters, whether relating to the official Journal or to other publications. 
He shall, with the advice and consent of the Council, appoint an Editorial Com- 
mittee of not less than twelve Members to cooperate with him for definite terms. 
All appointments to the Editorial Committee shall terminate with the appoint- 
ment of a new Editor. 


ARTICLE 2 
DUES 


Members shall pay seven dollars at the time of admission to membership and 
shall receive the full current volume of the official Journal. Thereafter Members 
shall pay seven dollars annual dues, of which five dollars shall be for a
subscription to the Official Journal. There shall be the following exceptions:




A. Two Members of the Institute who are husband and wife may elect to 
receive one copy of the Official Journal between them, when their dues 
shall each be reduced by twenty-five percent. 

B. Any Member may make a payment in place of all succeeding annual dues 
based on a suitable table and rate of interest specified by the Council. 

C. Any Member on active military duty may notify the Treasurer that he 
wishes neither to pay dues nor to receive the Official Journal during the 
current year. He may receive the official Journal for the suspended 
years on payment of one-half of the suspended dues within one year after 
resuming payment of annual dues. 

D. Any Member who resides outside the Western Hemisphere shall pay five 
dollars annual dues. 

Institutional Members shall pay annual dues of at least $100. For each $100 
of annual dues, an Institutional Member shall receive two copies of the Official 
Journal, one bound, and shall be entitled to designate one person to have the 
full prerogatives of a member without further payment of dues (including the 
receipt of a personal copy of the Official Journal). Twenty-five dollars of each 
$100 shall be allocated to the three subscriptions to the Official Journal and the 
binding of one copy. 

Annual dues shall be payable on the first day of January of each year.

It shall be the duty of the Treasurer to notify by mail anyone whose dues are
six months in arrears, enclosing a copy of this article. If such person fails to 
pay such dues within three months from the date of mailing such notice, the 
Treasurer shall report the delinquent to the Council, who may suspend the 
delinquent from membership and who may reinstate the delinquent upon pay- 
ment of arrears. 


ARTICLE 3
SALARIES

The Institute shall not pay a salary to any Officer, Councilor, or member of
any committee.

ARTICLE 4
AMENDMENTS


These Bylaws may be amended in the same manner as the Constitution or, 
if the proposed amendment has been previously approved by the Council, by a 
majority vote at any regularly convened meeting. 





JOURNAL OF THE AMERICAN 
STATISTICAL ASSOCIATION 
DECEMBER, 1948 


Articles 


Commercial Uses of Sampling. J. STEVENS STOCK AND JOSEPH R. HOCHSTIM
Variation of the Frequency of Fatal Quarrels with Magnitude. LEWIS F. RICHARDSON
Bank Reserves and Business Fluctuations. CLARK WARBURTON
The Ordering of n Items Assigned to k Rank Categories by Votes of m Individuals.
GARRET L. SCHUYLER

Levels of Significance for Variance Ratio of Two Samples of Equal Size 
C. J. KrrcHEen 
..D. J. FINNEY 
.. ALBERT H. BOWKER 

The War Production Board’s Statistical Reporting Experience, Part IV.
DAVID NOVICK AND GEORGE A. STEINER
Correction to “On Estimating Precision of Measuring Instruments and Product
Variability”. FRANK E. GRUBBS
Statistical Methodology Index. OSCAR KRISEN BUROS


AMERICAN STATISTICAL ASSOCIATION 
1603 K Street, N. W., Washington 6, D. C. 


MATHEMATICAL REVIEWS 


A journal containing reviews of the mathematical liter- 
ature of the world, with full subject and author indices 


Publication of this journal is sponsored by the American Mathe- 
matical Society, Mathematical Association of America, Institute of 
Mathematical Statistics, London Mathematical Society, Edinburgh 
Mathematical Society, Union Matematica Argentina, and others. 


Subscriptions accepted to cover the calendar year only. 
Issues appear monthly except July. $20.00 per year.


Send subscription order or request for sample copy to 


AMERICAN MATHEMATICAL SOCIETY 
531 West 116th Street, New York City 27 







